<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/xlators/cluster/afr/src/afr-common.c, branch v3.8.3</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order</title>
<updated>2016-08-22T10:22:36+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-07-28T15:59:59+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=d99f72842595306e9f26a275804bf0f310caba53'/>
<id>d99f72842595306e9f26a275804bf0f310caba53</id>
<content type='text'>
        Backport of: http://review.gluster.org/15080

When the bricks are brought offline and then online in cyclic
order while writes are in progress on a file, thanks to inode
refresh in write txns, AFR will mostly fail the write attempt
when the only good copy is offline. However, there is still a
remote possibility that the file will run into split-brain if
the brick that has the lone good copy goes offline *after* the
inode refresh but *before* the write txn completes (I call it
in-flight split-brain in the patch for ease of reference),
requiring intervention from admin to resolve the split-brain
before the IO can resume normally on the file. To get around this,
the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused)
    in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.

This way, one good copy survives the split-brain situation and, once it is
back online, is chosen as the source for the heal.
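
The decision described in points ii) and iii) can be sketched roughly as
follows. This is an illustration only, not the code in the patch: the function
name classify_write and the two boolean lists are hypothetical, and the sketch
ignores the dirty xattrs and the exact errno.

```python
def classify_write(readable, failed):
    """Decide which bricks to accuse after a write (hypothetical model).

    readable[i]: brick i held a good copy at inode-refresh time.
    failed[i]:   the write did not succeed on brick i.
    Returns (accused, write_ok).
    """
    good_survivors = [r and not f for r, f in zip(readable, failed)]
    if any(good_survivors):
        # Normal case: at least one good copy took the write,
        # so every brick that failed the write can be accused.
        return list(failed), True
    # In-flight split-brain: every good copy missed the write.
    # Do not accuse the good copies; report failure to the caller.
    accused = [f and not r for r, f in zip(readable, failed)]
    return accused, False
```

With one good copy on brick 0 that goes down mid-write,
classify_write([True, False, False], [True, False, False]) accuses nobody and
reports the write as failed, so brick 0 can still serve as the heal source.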

&gt; Change-Id: I9ca634b026ac830b172bac076437cc3bf1ae7d8a
&gt; BUG: 1363721
&gt; Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
&gt; Reviewed-on: http://review.gluster.org/15080
&gt; Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
&gt; Reviewed-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
(cherry picked from commit fcb5b70b1099d0379b40c81f35750df8bb9545a5)

Change-Id: I157f1025aebd6624fa3d412abc69a4ae6f2fe9e0
BUG: 1367272
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Signed-off-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Reviewed-on: http://review.gluster.org/15221
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
        Backport of: http://review.gluster.org/15080

When the bricks are brought offline and then online in cyclic
order while writes are in progress on a file, thanks to inode
refresh in write txns, AFR will mostly fail the write attempt
when the only good copy is offline. However, there is still a
remote possibility that the file will run into split-brain if
the brick that has the lone good copy goes offline *after* the
inode refresh but *before* the write txn completes (I call it
in-flight split-brain in the patch for ease of reference),
requiring intervention from admin to resolve the split-brain
before the IO can resume normally on the file. To get around this,
the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused)
    in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.

This way, one good copy survives the split-brain situation and, once it is
back online, is chosen as the source for the heal.

&gt; Change-Id: I9ca634b026ac830b172bac076437cc3bf1ae7d8a
&gt; BUG: 1363721
&gt; Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
&gt; Reviewed-on: http://review.gluster.org/15080
&gt; Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
&gt; Reviewed-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
(cherry picked from commit fcb5b70b1099d0379b40c81f35750df8bb9545a5)

Change-Id: I157f1025aebd6624fa3d412abc69a4ae6f2fe9e0
BUG: 1367272
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Signed-off-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Reviewed-on: http://review.gluster.org/15221
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: some coverity fixes</title>
<updated>2016-07-28T13:54:49+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2016-07-12T04:37:48+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=823eb274a3c4226aea44f6feb955a5df04aae190'/>
<id>823eb274a3c4226aea44f6feb955a5df04aae190</id>
<content type='text'>
Note: This is a backport of http://review.gluster.org/14895.
It contains:
i) fixes that prevent deadlocks (afr-common.c).
ii) a fix that prevents overwriting op-errno=ENOMEM with other possible
values (afr-inode-read.c).
iii) a fix that prevents further operations on a NULL dictionary if
allocation fails (afr-self-heal-data.c).
iv) a fix that prevents falsely marking a sink as healed if metadata heal
fails midway (afr-self-heal-metadata.c).
v) other minor fixes.

Since the above are not trivial fixes, the patch is a good
candidate for merging into the 3.8 branch.

Thanks to Krutika for a cleaner way to track inode refs in
afr_set_split_brain_choice().

Change-Id: I2d968d05b815ad764b7e3f8aa9ad95a792b3c1df
BUG: 1360556
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15018
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Note: This is a backport of http://review.gluster.org/14895.
It contains:
i) fixes that prevent deadlocks (afr-common.c).
ii) a fix that prevents overwriting op-errno=ENOMEM with other possible
values (afr-inode-read.c).
iii) a fix that prevents further operations on a NULL dictionary if
allocation fails (afr-self-heal-data.c).
iv) a fix that prevents falsely marking a sink as healed if metadata heal
fails midway (afr-self-heal-metadata.c).
v) other minor fixes.

Since the above are not trivial fixes, the patch is a good
candidate for merging into the 3.8 branch.

Thanks to Krutika for a cleaner way to track inode refs in
afr_set_split_brain_choice().

Change-Id: I2d968d05b815ad764b7e3f8aa9ad95a792b3c1df
BUG: 1360556
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15018
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Unwind with xdata in inode-write fops</title>
<updated>2016-06-13T10:22:31+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-05-31T09:19:33+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=1cd0e86cea9a6d3e52340cfa33622bfb4b9ce4d6'/>
<id>1cd0e86cea9a6d3e52340cfa33622bfb4b9ce4d6</id>
<content type='text'>
On failure, afr was not unwinding xdata to the xlators above it.
xdata need not be NULL on failures, so it is important to send it
to the parent xlators.

 &gt;Change-Id: Ic36aac10a79fa91121961932dd1920cb1c2c3a4c
 &gt;BUG: 1340623
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: http://review.gluster.org/14567
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
 &gt;Reviewed-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;

BUG: 1342178
Change-Id: Idd74d2bc898fe5aef537ab48c1754510030c8825
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/14618
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On failure, afr was not unwinding xdata to the xlators above it.
xdata need not be NULL on failures, so it is important to send it
to the parent xlators.

 &gt;Change-Id: Ic36aac10a79fa91121961932dd1920cb1c2c3a4c
 &gt;BUG: 1340623
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: http://review.gluster.org/14567
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
 &gt;Reviewed-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;

BUG: 1342178
Change-Id: Idd74d2bc898fe5aef537ab48c1754510030c8825
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/14618
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Unwind xdata_rsp even in case of failures</title>
<updated>2016-06-10T15:36:21+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-05-27T10:17:07+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=de56d9591ed94fc6f77e6f97ea6bbfaeae8e19fd'/>
<id>de56d9591ed94fc6f77e6f97ea6bbfaeae8e19fd</id>
<content type='text'>
DHT expects GF_PREOP_CHECK_FAILED to be present in xdata_rsp in case of mkdir
failures because of stale layout. But AFR was unwinding null xdata_rsp in case
of failures. This was leading to mkdir failures just after remove-brick. Unwind
the xdata_rsp in case of failures to make sure the response from brick reaches
dht.

 &gt;BUG: 1340623
 &gt;Change-Id: Idd3f7b95730e8ea987b608e892011ff190e181d1
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: http://review.gluster.org/14553
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
 &gt;Reviewed-by: Anuradha Talur &lt;atalur@redhat.com&gt;
 &gt;Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;

BUG: 1342178
Change-Id: Iaacadcad0f76979fb250bd008b8e43f0e7acf642
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/14617
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
DHT expects GF_PREOP_CHECK_FAILED to be present in xdata_rsp in case of mkdir
failures because of stale layout. But AFR was unwinding null xdata_rsp in case
of failures. This was leading to mkdir failures just after remove-brick. Unwind
the xdata_rsp in case of failures to make sure the response from brick reaches
dht.

 &gt;BUG: 1340623
 &gt;Change-Id: Idd3f7b95730e8ea987b608e892011ff190e181d1
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: http://review.gluster.org/14553
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
 &gt;Reviewed-by: Anuradha Talur &lt;atalur@redhat.com&gt;
 &gt;Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;

BUG: 1342178
Change-Id: Iaacadcad0f76979fb250bd008b8e43f0e7acf642
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/14617
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Do not inode_link in afr</title>
<updated>2016-05-25T10:31:27+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-05-19T10:54:09+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=6c88889e333fcaaa8ddcd136f480108007e339c1'/>
<id>6c88889e333fcaaa8ddcd136f480108007e339c1</id>
<content type='text'>
Race is explained at
https://bugzilla.redhat.com/show_bug.cgi?id=1337405#c0

This patch also performs self-heal with shd-pid, and heals using an
inode from this-&gt;itable rather than from the main itable.

 &gt;BUG: 1337405
 &gt;Change-Id: Id657a6623b71998b027b1dff6af5bbdf8cab09c9
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: http://review.gluster.org/14422
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
 &gt;Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;

BUG: 1337870
Change-Id: Ifb476eeed2ff73a44e481d64074599ab0707c725
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/14455
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Race is explained at
https://bugzilla.redhat.com/show_bug.cgi?id=1337405#c0

This patch also performs self-heal with shd-pid, and heals using an
inode from this-&gt;itable rather than from the main itable.

 &gt;BUG: 1337405
 &gt;Change-Id: Id657a6623b71998b027b1dff6af5bbdf8cab09c9
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: http://review.gluster.org/14422
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
 &gt;Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;

BUG: 1337870
Change-Id: Ifb476eeed2ff73a44e481d64074599ab0707c725
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/14455
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Refresh inode for inode-write fops in need</title>
<updated>2016-05-24T21:42:30+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-05-16T09:35:36+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=a770c7bba13734602b11a750e037cb11e42fe706'/>
<id>a770c7bba13734602b11a750e037cb11e42fe706</id>
<content type='text'>
Problem:
If a named fresh-lookup is done on a loc and the fop fails on one of the
bricks, or is not sent to one of them, but that brick is up by the time the
response reaches afr, 'can_interpret' is set to false in afr_lookup_done().
As a result, the inode-ctx for that inode is not set. This can lead to EIO
in a transaction, because the transaction depends on the 'readable' array
being available by that point.

Fix:
Refresh the inode for inode-write fops so that the ctx is set if that was not
already done at the time of the named fresh-lookup. Also refresh when the file
is in split-brain, where one more refresh is needed before failing the fop to
check whether the file is still in split-brain.

 &gt;BUG: 1336612
 &gt;Change-Id: I5c50b62c8de06129b8516039f7c252e5008c47a5
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: http://review.gluster.org/14368
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;

BUG: 1337822
Change-Id: I0f904ebaa78b99cbb11546e08c9fc1562e9a3eef
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/14449
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-by: Anuradha Talur &lt;atalur@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
If a named fresh-lookup is done on a loc and the fop fails on one of the
bricks, or is not sent to one of them, but that brick is up by the time the
response reaches afr, 'can_interpret' is set to false in afr_lookup_done().
As a result, the inode-ctx for that inode is not set. This can lead to EIO
in a transaction, because the transaction depends on the 'readable' array
being available by that point.

Fix:
Refresh the inode for inode-write fops so that the ctx is set if that was not
already done at the time of the named fresh-lookup. Also refresh when the file
is in split-brain, where one more refresh is needed before failing the fop to
check whether the file is still in split-brain.

 &gt;BUG: 1336612
 &gt;Change-Id: I5c50b62c8de06129b8516039f7c252e5008c47a5
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: http://review.gluster.org/14368
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;

BUG: 1337822
Change-Id: I0f904ebaa78b99cbb11546e08c9fc1562e9a3eef
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/14449
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-by: Anuradha Talur &lt;atalur@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Handle non-zero source in heal-info decision</title>
<updated>2016-05-14T14:12:38+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-05-12T08:25:44+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=91dcbfb58ac06b84940d3de9049514465a3acd6b'/>
<id>91dcbfb58ac06b84940d3de9049514465a3acd6b</id>
<content type='text'>
        Backport of http://review.gluster.org/14302

Problem:
Spurious entries are reported in heal info when the mount is on the second or
third brick of the replica pair, because local-child is given preference in
selecting the source. The code is supposed to report that the file needs heal
only if (source &lt; 0) (the failure code path), but it is written so that any
non-zero value is treated as failure.

Fix:
Treat a positive source as success.
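
The difference between the two checks can be sketched as below. This is a
simplified illustration, not the afr code: the function names are hypothetical,
and 'source' stands for a chosen brick index, with a negative value meaning
that no source could be selected.

```python
def needs_heal_buggy(source):
    # Treats any non-zero source as failure, so a valid
    # source index of 1 or 2 is reported as needing heal.
    return source != 0


def needs_heal_fixed(source):
    # Only a negative source means no source could be chosen.
    return not source >= 0
```

With the mount on the second brick, the local child is index 1:
needs_heal_buggy(1) reports a heal, while needs_heal_fixed(1) does not.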

BUG: 1335433
Change-Id: Iede983b6560622964e91306405587da3f1de5748
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/14303
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-by: Anuradha Talur &lt;atalur@redhat.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
        Backport of http://review.gluster.org/14302

Problem:
Spurious entries are reported in heal info when the mount is on the second or
third brick of the replica pair, because local-child is given preference in
selecting the source. The code is supposed to report that the file needs heal
only if (source &lt; 0) (the failure code path), but it is written so that any
non-zero value is treated as failure.

Fix:
Treat a positive source as success.

BUG: 1335433
Change-Id: Iede983b6560622964e91306405587da3f1de5748
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/14303
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-by: Anuradha Talur &lt;atalur@redhat.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Entry self-heal performance enhancements</title>
<updated>2016-04-30T01:21:56+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2015-10-14T08:44:51+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=84c8cc9c5936a2a7539f343c180f06312c8f6d39'/>
<id>84c8cc9c5936a2a7539f343c180f06312c8f6d39</id>
<content type='text'>
Change-Id: I52da41dff5619492b656c2217f4716a6cdadebe0
BUG: 1269461
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/12442
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: I52da41dff5619492b656c2217f4716a6cdadebe0
BUG: 1269461
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/12442
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: propagate child up event after timeout</title>
<updated>2016-04-27T07:35:19+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2015-12-23T08:19:14+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=3c35329feb4dd479c9e4856ee27fa4b12c708db2'/>
<id>3c35329feb4dd479c9e4856ee27fa4b12c708db2</id>
<content type='text'>
Problem: During mount, afr waits for a response from all its children before
notifying the parent xlator. In a 1x2 replica volume, if one of the nodes is
down, the mount will hang for more than a minute until child down is received
from the client xlator for that node.

Fix:
When parent up is received by afr, start a 10-second timer. In the timer
callback, if a successful child up has been received from at least one brick,
propagate the event to the parent xlator.
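
A toy model of this behaviour, assuming a hypothetical ChildUpTracker class
whose method names are invented for illustration (the real change lives in
afr's notify path and uses the gluster timer infrastructure):

```python
class ChildUpTracker:
    """Toy model of deferring the child-up event to the parent."""

    TIMEOUT_SECS = 10  # value taken from the commit message

    def __init__(self, child_count):
        self.child_up = [None] * child_count  # None = no event yet
        self.propagated = False

    def on_child_event(self, idx, is_up):
        self.child_up[idx] = is_up
        # Once every child has reported, there is nothing left to wait for.
        if None not in self.child_up:
            self._propagate_if_any_up()

    def on_timer_expiry(self):
        # Timer callback: stop waiting for children that stayed silent.
        self._propagate_if_any_up()

    def _propagate_if_any_up(self):
        if not self.propagated and any(u is True for u in self.child_up):
            self.propagated = True
```

In a standalone Python program the expiry could be driven by
threading.Timer(ChildUpTracker.TIMEOUT_SECS, tracker.on_timer_expiry).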

Change-Id: I31e57c8802c1a03a4a5d581ee4ab82f3a9c8799d
BUG: 1054694
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/11113
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem: During mount, afr waits for a response from all its children before
notifying the parent xlator. In a 1x2 replica volume, if one of the nodes is
down, the mount will hang for more than a minute until child down is received
from the client xlator for that node.

Fix:
When parent up is received by afr, start a 10-second timer. In the timer
callback, if a successful child up has been received from at least one brick,
propagate the event to the parent xlator.

Change-Id: I31e57c8802c1a03a4a5d581ee4ab82f3a9c8799d
BUG: 1054694
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/11113
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Fix spurious entries in heal info</title>
<updated>2016-04-20T11:51:20+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-03-31T09:10:09+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=b6a0780d86e7c6afe7ae0d9a87e6fe5c62b4d792'/>
<id>b6a0780d86e7c6afe7ae0d9a87e6fe5c62b4d792</id>
<content type='text'>
Problem:
Locking schemes in afr-v1 locked the directory/file completely during
self-heal. Newer locking schemes don't require full directory or file locking.
But afr-v2 still has compatibility code to work well with older clients: in
entry self-heal it takes a lock on a special 256-character name which can't
be created on the fs. Similarly, for data self-heal there used to be a lock on
(LLONG_MAX-2, 1). The old locking scheme requires heal-info to take sh-domain
locks before examining heal state. If it doesn't take sh-domain locks,
heal-info may hang until self-heal completes because of the compatibility
locks. But the problem with heal-info taking sh-domain locks is that if two
heal-info processes, or shd and heal-info, try to inspect heal state in
parallel using trylocks on sh-domain, both of them may assume that a heal is
in progress. This was leading to spurious entries being shown in heal-info.

Fix:
As long as the afr-v1 way of locking exists, we can't fix this problem with
simple solutions. If we know that the cluster is running the newer locking
schemes, we can give accurate information in heal-info. So introduce a new
option called 'locking-scheme' which, if set to 'granular', gives correct
information in heal-info. In addition, the extra network hops for taking
compatibility locks and sh-domain locks in heal-info are no longer necessary,
which improves performance.

BUG: 1322850
Change-Id: Ia563c5f096b5922009ff0ec1c42d969d55d827a3
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/13873
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
Reviewed-by: Anuradha Talur &lt;atalur@redhat.com&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Locking schemes in afr-v1 locked the directory/file completely during
self-heal. Newer locking schemes don't require full directory or file locking.
But afr-v2 still has compatibility code to work well with older clients: in
entry self-heal it takes a lock on a special 256-character name which can't
be created on the fs. Similarly, for data self-heal there used to be a lock on
(LLONG_MAX-2, 1). The old locking scheme requires heal-info to take sh-domain
locks before examining heal state. If it doesn't take sh-domain locks,
heal-info may hang until self-heal completes because of the compatibility
locks. But the problem with heal-info taking sh-domain locks is that if two
heal-info processes, or shd and heal-info, try to inspect heal state in
parallel using trylocks on sh-domain, both of them may assume that a heal is
in progress. This was leading to spurious entries being shown in heal-info.

Fix:
As long as the afr-v1 way of locking exists, we can't fix this problem with
simple solutions. If we know that the cluster is running the newer locking
schemes, we can give accurate information in heal-info. So introduce a new
option called 'locking-scheme' which, if set to 'granular', gives correct
information in heal-info. In addition, the extra network hops for taking
compatibility locks and sh-domain locks in heal-info are no longer necessary,
which improves performance.

BUG: 1322850
Change-Id: Ia563c5f096b5922009ff0ec1c42d969d55d827a3
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/13873
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
Reviewed-by: Anuradha Talur &lt;atalur@redhat.com&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
