<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/xlators/cluster/afr/src, branch v3.7.19</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>afr: use accused matrix instead of readable matrix for deciding heals</title>
<updated>2016-12-28T09:15:27+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2016-12-23T07:11:13+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=2892fb43027b2c3c39b9e3b32ea99a3a090c0297'/>
<id>2892fb43027b2c3c39b9e3b32ea99a3a090c0297</id>
<content type='text'>
Problem:
afr_replies_interpret() used the 'readable' matrix to trigger client
side heals after inode refresh. But for arbiter, readable is always
zero. So when `dd` is run with a data brick down, spurious data heals
are are triggered. These heals open an fd, causing eager lock to be
disabled (open fd count &gt;1) in afr transactions, leading to extra FXATTROPS

Fix:
Use the accused matrix (derived from interpreting the afr pending
xattrs) to decide whether we can start heal or not.


&gt; Reviewed-on: http://review.gluster.org/16277
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
&gt; Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
(cherry picked from commit 5a7c86e578f5bbd793126a035c30e6b052177a9f)

Change-Id: Ibbd56c9aed6026de6ec42422e60293702aaf55f9
BUG: 1408820
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16299
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
afr_replies_interpret() used the 'readable' matrix to trigger client
side heals after inode refresh. But for arbiter, readable is always
zero. So when `dd` is run with a data brick down, spurious data heals
are are triggered. These heals open an fd, causing eager lock to be
disabled (open fd count &gt;1) in afr transactions, leading to extra FXATTROPS

Fix:
Use the accused matrix (derived from interpreting the afr pending
xattrs) to decide whether we can start heal or not.


&gt; Reviewed-on: http://review.gluster.org/16277
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
&gt; Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
(cherry picked from commit 5a7c86e578f5bbd793126a035c30e6b052177a9f)

Change-Id: Ibbd56c9aed6026de6ec42422e60293702aaf55f9
BUG: 1408820
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16299
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: When failing fop due to lack of quorum, also log error string</title>
<updated>2016-11-11T12:23:39+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-11-08T11:54:41+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=6a401f152b109e1d9efb5a7801648d79eba06d3b'/>
<id>6a401f152b109e1d9efb5a7801648d79eba06d3b</id>
<content type='text'>
        Backport of: http://review.gluster.org/#/c/15800

Change-Id: I1e73b4518bcf26196d6326065ad404f878e70bd4
BUG: 1393631
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15814
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
        Backport of: http://review.gluster.org/#/c/15800

Change-Id: I1e73b4518bcf26196d6326065ad404f878e70bd4
BUG: 1393631
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15814
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>features/shard: Fill loc.pargfid too for named lookups on individual shards</title>
<updated>2016-11-09T06:39:14+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-11-07T10:36:56+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=c5d2002ac2e68dc8aa7d54a32e9bd441b9b8bcfa'/>
<id>c5d2002ac2e68dc8aa7d54a32e9bd441b9b8bcfa</id>
<content type='text'>
        Backport of: http://review.gluster.org/#/c/15788/

On a sharded volume when a brick is replaced while IO is going on, named
lookup on individual shards as part of read/write was failing with
ENOENT on the replaced brick, and as a result AFR initiated name heal in
lookup callback. But since pargfid was empty (which is what this patch
attempts to fix), the resolution of the shards by protocol/server used
to fail and the following pattern of logs was seen:

Brick-logs:

[2016-11-08 07:41:49.387127] W [MSGID: 115009]
[server-resolve.c:566:server_resolve] 0-rep-server: no resolution type
for (null) (LOOKUP)
[2016-11-08 07:41:49.387157] E [MSGID: 115050]
[server-rpc-fops.c:156:server_lookup_cbk] 0-rep-server: 91833: LOOKUP(null)
(00000000-0000-0000-0000-000000000000/16d47463-ece5-4b33-9c93-470be918c0f6.82)
==&gt; (Invalid argument) [Invalid argument]

Client-logs:
[2016-11-08 07:41:27.497687] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-0: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.497755] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-1: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.498500] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-2: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.499680] E [MSGID: 133010]

Also, this patch makes AFR by itself choose a non-NULL pargfid even if
its ancestors fail to initialize all pargfid placeholders.

Change-Id: I34b9f90d0f09766b6d87b3994d5cd7a77b622dcb
BUG: 1392853
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15797
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
        Backport of: http://review.gluster.org/#/c/15788/

On a sharded volume when a brick is replaced while IO is going on, named
lookup on individual shards as part of read/write was failing with
ENOENT on the replaced brick, and as a result AFR initiated name heal in
lookup callback. But since pargfid was empty (which is what this patch
attempts to fix), the resolution of the shards by protocol/server used
to fail and the following pattern of logs was seen:

Brick-logs:

[2016-11-08 07:41:49.387127] W [MSGID: 115009]
[server-resolve.c:566:server_resolve] 0-rep-server: no resolution type
for (null) (LOOKUP)
[2016-11-08 07:41:49.387157] E [MSGID: 115050]
[server-rpc-fops.c:156:server_lookup_cbk] 0-rep-server: 91833: LOOKUP(null)
(00000000-0000-0000-0000-000000000000/16d47463-ece5-4b33-9c93-470be918c0f6.82)
==&gt; (Invalid argument) [Invalid argument]

Client-logs:
[2016-11-08 07:41:27.497687] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-0: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.497755] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-1: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.498500] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-2: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.499680] E [MSGID: 133010]

Also, this patch makes AFR by itself choose a non-NULL pargfid even if
its ancestors fail to initialize all pargfid placeholders.

Change-Id: I34b9f90d0f09766b6d87b3994d5cd7a77b622dcb
BUG: 1392853
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15797
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr,ec: Heal device files with correct major, minor numbers</title>
<updated>2016-10-27T06:23:20+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-10-26T01:21:18+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=24adb0607683cfec784933f332252a4cd53b8cb7'/>
<id>24adb0607683cfec784933f332252a4cd53b8cb7</id>
<content type='text'>
Thanks a lot to xiaoping.wu@nokia.com from Nokia for the bug and the
fix.

 &gt;BUG: 1384297
 &gt;Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: http://review.gluster.org/15728
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;

Change-Id: I28636a741592335cebcaa1abc2af8460ebc740e1
BUG: 1388949
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15736
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Thanks a lot to xiaoping.wu@nokia.com from Nokia for the bug and the
fix.

 &gt;BUG: 1384297
 &gt;Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: http://review.gluster.org/15728
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;

Change-Id: I28636a741592335cebcaa1abc2af8460ebc740e1
BUG: 1388949
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15736
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: Take full locks in arbiter only for data transactions</title>
<updated>2016-10-16T06:11:12+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2016-10-14T10:39:08+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=7f8916364ff2092e0391b294051dc76764ff02fc'/>
<id>7f8916364ff2092e0391b294051dc76764ff02fc</id>
<content type='text'>
Problem:
Sharding exposed a bug in arbiter config. where `dd` throughput was
extremely slow. Shard xlator was sending a fxattrop to update the file
size immediately after a writev. Arbiter was incorrectly over-riding the
LLONGMAX-1 start offset (for metadata domain locks) for this fxattrop,
causing the inodelk to be taken on the data domain. And since the
preceeding writev hadn't released the lock (afr does a 'lazy'
unlock if write succeeds on all bricks), this degraded to a blocking
lock causing extra lock/unlock calls and delays.

Fix:
Modify flock.l_len and flock.l_start to take full locks only for data
transactions.

&gt; Reviewed-on: http://review.gluster.org/15641
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
(cherry picked from commit 3a97486d7f9d0db51abcb13dcd3bc9db935e3a60)

Change-Id: I906895da2f2d16813607e6c906cb4defb21d7c3b
BUG: 1385226
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reported-by: Max Raba &lt;max.raba@comsysto.com&gt;
Reviewed-on: http://review.gluster.org/15649
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Sharding exposed a bug in arbiter config. where `dd` throughput was
extremely slow. Shard xlator was sending a fxattrop to update the file
size immediately after a writev. Arbiter was incorrectly over-riding the
LLONGMAX-1 start offset (for metadata domain locks) for this fxattrop,
causing the inodelk to be taken on the data domain. And since the
preceeding writev hadn't released the lock (afr does a 'lazy'
unlock if write succeeds on all bricks), this degraded to a blocking
lock causing extra lock/unlock calls and delays.

Fix:
Modify flock.l_len and flock.l_start to take full locks only for data
transactions.

&gt; Reviewed-on: http://review.gluster.org/15641
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
(cherry picked from commit 3a97486d7f9d0db51abcb13dcd3bc9db935e3a60)

Change-Id: I906895da2f2d16813607e6c906cb4defb21d7c3b
BUG: 1385226
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reported-by: Max Raba &lt;max.raba@comsysto.com&gt;
Reviewed-on: http://review.gluster.org/15649
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order</title>
<updated>2016-08-22T10:05:08+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-07-28T15:59:59+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=febaa1e46d3a91a29c4786a17abf29cfc7178254'/>
<id>febaa1e46d3a91a29c4786a17abf29cfc7178254</id>
<content type='text'>
        Backport of: http://review.gluster.org/15080

When the bricks are brought offline and then online in cyclic
order while writes are in progress on a file, thanks to inode
refresh in write txns, AFR will mostly fail the write attempt
when the only good copy is offline. However, there is still a
remote possibility that the file will run into split-brain if
the brick that has the lone good copy goes offline *after* the
inode refresh but *before* the write txn completes (I call it
in-flight split-brain in the patch for ease of reference),
requiring intervention from admin to resolve the split-brain
before the IO can resume normally on the file. To get around this,
the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused)
    in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.

This way, we still have one good copy left despite the split-brain situation
which when it is back online, will be chosen as source to do the heal.

Change-Id: I7c13c6ddd5b8fe88b0f2684e8ce5f4a9c3a24a08
BUG: 1367270
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15222
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
        Backport of: http://review.gluster.org/15080

When the bricks are brought offline and then online in cyclic
order while writes are in progress on a file, thanks to inode
refresh in write txns, AFR will mostly fail the write attempt
when the only good copy is offline. However, there is still a
remote possibility that the file will run into split-brain if
the brick that has the lone good copy goes offline *after* the
inode refresh but *before* the write txn completes (I call it
in-flight split-brain in the patch for ease of reference),
requiring intervention from admin to resolve the split-brain
before the IO can resume normally on the file. To get around this,
the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused)
    in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.

This way, we still have one good copy left despite the split-brain situation
which when it is back online, will be chosen as source to do the heal.

Change-Id: I7c13c6ddd5b8fe88b0f2684e8ce5f4a9c3a24a08
BUG: 1367270
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15222
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: copy loc before passing to syncop</title>
<updated>2016-08-17T11:44:11+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-08-02T09:49:00+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=318aacabbc482bcc2e1686988a77ad0bc054837e'/>
<id>318aacabbc482bcc2e1686988a77ad0bc054837e</id>
<content type='text'>
Problem:
When io-threads is enabled on the client side, io-threads destroys the
call-stub in which the loc is stored as soon as the c-stack unwinds.
Because afr is creating a syncop with the address of loc passed in
setxattr by the time syncop tries to access it, io-threads would have
already freed the call-stub. This will lead to crash.

Fix:
Copy loc to frame-&gt;local and use it's address.

&gt; Reviewed-on: http://review.gluster.org/15070
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;

BUG: 1367305
Change-Id: I16987e491e24b0b4e3d868a6968e802e47c77f7a
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Signed-off-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Reviewed-on: http://review.gluster.org/15168
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
When io-threads is enabled on the client side, io-threads destroys the
call-stub in which the loc is stored as soon as the c-stack unwinds.
Because afr is creating a syncop with the address of loc passed in
setxattr by the time syncop tries to access it, io-threads would have
already freed the call-stub. This will lead to crash.

Fix:
Copy loc to frame-&gt;local and use it's address.

&gt; Reviewed-on: http://review.gluster.org/15070
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;

BUG: 1367305
Change-Id: I16987e491e24b0b4e3d868a6968e802e47c77f7a
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Signed-off-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Reviewed-on: http://review.gluster.org/15168
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Bug fixes in txn codepath</title>
<updated>2016-08-17T10:22:53+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-08-05T06:48:05+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=e65e066c4f993aac626112e718ee66d35d15c6a8'/>
<id>e65e066c4f993aac626112e718ee66d35d15c6a8</id>
<content type='text'>
        Backport of: http://review.gluster.org/15145

AFR sets transaction.pre_op[] array even before actually doing the
pre-op on-disk. Therefore, AFR must not only consider the pre_op[] array
but also the failed_subvols[] information before setting the pre_op_done[]
flag. This patch fixes that.

Change-Id: I8163256a6de254be43a7a526c6d2f9dc30e0e1df
BUG: 1367270
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15162
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Anuradha Talur &lt;atalur@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
        Backport of: http://review.gluster.org/15145

AFR sets transaction.pre_op[] array even before actually doing the
pre-op on-disk. Therefore, AFR must not only consider the pre_op[] array
but also the failed_subvols[] information before setting the pre_op_done[]
flag. This patch fixes that.

Change-Id: I8163256a6de254be43a7a526c6d2f9dc30e0e1df
BUG: 1367270
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15162
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Anuradha Talur &lt;atalur@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: some coverity fixes</title>
<updated>2016-07-28T13:54:36+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2016-07-12T04:37:48+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=53b54f1a03a5fb93266f66453d228258598f8ef6'/>
<id>53b54f1a03a5fb93266f66453d228258598f8ef6</id>
<content type='text'>
Backport of http://review.gluster.org/#/c/14895/

Thanks to Krutika for a cleaner way to track inode refs in
afr_set_split_brain_choice().

Change-Id: I2d968d05b815ad764b7e3f8aa9ad95a792b3c1df
BUG: 1360549
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15017
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Backport of http://review.gluster.org/#/c/14895/

Thanks to Krutika for a cleaner way to track inode refs in
afr_set_split_brain_choice().

Change-Id: I2d968d05b815ad764b7e3f8aa9ad95a792b3c1df
BUG: 1360549
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15017
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr:Don't wind reads for files in metadata split-brain</title>
<updated>2016-06-27T07:13:36+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2016-02-05T09:40:06+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=e4ea25e9eea0f7259c11333f7a75049f3dccb7a7'/>
<id>e4ea25e9eea0f7259c11333f7a75049f3dccb7a7</id>
<content type='text'>
Backport of http://review.gluster.org/#/c/13389/

Problem: For a read on  a file in metadata split-brain:
1.lookup_done resets event_generation to zero.
2. readv is issued, goes to inode refresh due to mismatching event_gen.
3. After refresh is successful, we update event_generation, data and
metdata readable.
3. We then call afr_read_txn_refresh_done() which in turn calls
afr_inode_get_readable() but doesn't check for EIO. So afr_readv_wind
is called with local-&gt;readable (which is populated with data_readable),
thus winding the read to a brick.
4. Also, further parallel reads that come directly go to the wind path
because there is no inode_refresh needed.

Fix:
1.For any afr_read_txn(), readable must be an intersection of data and metadata
readable.
2.Check for EIO in afr_read_txn_refresh_done().

Change-Id: I22dd221fdfaf96d7aced2f474e28ed1337d69f0e
BUG: 1349881
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
(cherry picked from commit 7a1c1e2904701496968ed14b6d7479fb706c3188)
Reviewed-on: http://review.gluster.org/14791
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Backport of http://review.gluster.org/#/c/13389/

Problem: For a read on  a file in metadata split-brain:
1.lookup_done resets event_generation to zero.
2. readv is issued, goes to inode refresh due to mismatching event_gen.
3. After refresh is successful, we update event_generation, data and
metdata readable.
3. We then call afr_read_txn_refresh_done() which in turn calls
afr_inode_get_readable() but doesn't check for EIO. So afr_readv_wind
is called with local-&gt;readable (which is populated with data_readable),
thus winding the read to a brick.
4. Also, further parallel reads that come directly go to the wind path
because there is no inode_refresh needed.

Fix:
1.For any afr_read_txn(), readable must be an intersection of data and metadata
readable.
2.Check for EIO in afr_read_txn_refresh_done().

Change-Id: I22dd221fdfaf96d7aced2f474e28ed1337d69f0e
BUG: 1349881
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
(cherry picked from commit 7a1c1e2904701496968ed14b6d7479fb706c3188)
Reviewed-on: http://review.gluster.org/14791
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
