<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/xlators/cluster/afr/src, branch v3.8.6</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>cluster/afr: When failing fop due to lack of quorum, also log error string</title>
<updated>2016-11-11T12:23:54+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-11-08T11:54:41+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=0f97f880b0d01baf94e10950e63e4bb823ed2e40'/>
<id>0f97f880b0d01baf94e10950e63e4bb823ed2e40</id>
<content type='text'>
        Backport of: http://review.gluster.org/#/c/15800/

Change-Id: I2dd7ed69a456e8b9e54a4093f14dc16950bef081
BUG: 1393630
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15813
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
        Backport of: http://review.gluster.org/#/c/15800/

Change-Id: I2dd7ed69a456e8b9e54a4093f14dc16950bef081
BUG: 1393630
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15813
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>features/shard: Fill loc.pargfid too for named lookups on individual shards</title>
<updated>2016-11-09T07:28:19+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-11-07T10:36:56+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=ce95b633e859d90edcbd2868da73d7ff392defaf'/>
<id>ce95b633e859d90edcbd2868da73d7ff392defaf</id>
<content type='text'>
        Backport of: http://review.gluster.org/#/c/15788/

On a sharded volume when a brick is replaced while IO is going on, named
lookup on individual shards as part of read/write was failing with
ENOENT on the replaced brick, and as a result AFR initiated name heal in
lookup callback. But since pargfid was empty (which is what this patch
attempts to fix), the resolution of the shards by protocol/server used
to fail and the following pattern of logs was seen:

Brick-logs:

[2016-11-08 07:41:49.387127] W [MSGID: 115009]
[server-resolve.c:566:server_resolve] 0-rep-server: no resolution type
for (null) (LOOKUP)
[2016-11-08 07:41:49.387157] E [MSGID: 115050]
[server-rpc-fops.c:156:server_lookup_cbk] 0-rep-server: 91833: LOOKUP(null)
(00000000-0000-0000-0000-000000000000/16d47463-ece5-4b33-9c93-470be918c0f6.82)
==&gt; (Invalid argument) [Invalid argument]

Client-logs:
[2016-11-08 07:41:27.497687] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-0: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.497755] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-1: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.498500] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-2: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.499680] E [MSGID: 133010]

Also, this patch makes AFR by itself choose a non-NULL pargfid even if
its ancestors fail to initialize all pargfid placeholders.

Change-Id: Ica9e1b5b196ac37aafe6128e7aa0694a07245fdb
BUG: 1392846
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15796
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
        Backport of: http://review.gluster.org/#/c/15788/

On a sharded volume when a brick is replaced while IO is going on, named
lookup on individual shards as part of read/write was failing with
ENOENT on the replaced brick, and as a result AFR initiated name heal in
lookup callback. But since pargfid was empty (which is what this patch
attempts to fix), the resolution of the shards by protocol/server used
to fail and the following pattern of logs was seen:

Brick-logs:

[2016-11-08 07:41:49.387127] W [MSGID: 115009]
[server-resolve.c:566:server_resolve] 0-rep-server: no resolution type
for (null) (LOOKUP)
[2016-11-08 07:41:49.387157] E [MSGID: 115050]
[server-rpc-fops.c:156:server_lookup_cbk] 0-rep-server: 91833: LOOKUP(null)
(00000000-0000-0000-0000-000000000000/16d47463-ece5-4b33-9c93-470be918c0f6.82)
==&gt; (Invalid argument) [Invalid argument]

Client-logs:
[2016-11-08 07:41:27.497687] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-0: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.497755] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-1: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.498500] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-2: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.499680] E [MSGID: 133010]

Also, this patch makes AFR by itself choose a non-NULL pargfid even if
its ancestors fail to initialize all pargfid placeholders.

Change-Id: Ica9e1b5b196ac37aafe6128e7aa0694a07245fdb
BUG: 1392846
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15796
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr,ec: Heal device files with correct major, minor numbers</title>
<updated>2016-10-27T06:23:45+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-10-26T01:21:18+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=6e18d90b218dfa3d6ecdea8cb4f8a7ce56bde74a'/>
<id>6e18d90b218dfa3d6ecdea8cb4f8a7ce56bde74a</id>
<content type='text'>
Thanks a lot to xiaoping.wu@nokia.com from Nokia for the bug and the
fix.

 &gt;BUG: 1384297
 &gt;Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: http://review.gluster.org/15728
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;

Change-Id: I7646adc3771ff76cdf9c979b575bbcd0b3bc1b9a
BUG: 1388948
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15735
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Thanks a lot to xiaoping.wu@nokia.com from Nokia for the bug and the
fix.

 &gt;BUG: 1384297
 &gt;Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: http://review.gluster.org/15728
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;

Change-Id: I7646adc3771ff76cdf9c979b575bbcd0b3bc1b9a
BUG: 1388948
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15735
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Prevent dict_set() on NULL dict</title>
<updated>2016-10-17T10:04:45+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-09-30T11:46:43+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=1f4a350ed67b383ff7dd8c04c5b49ed0db2e7cb4'/>
<id>1f4a350ed67b383ff7dd8c04c5b49ed0db2e7cb4</id>
<content type='text'>
In afr lookup when NULL dict is received in lookup, afr
is supposed to set all the xattrs it requires in a new dict
it creates, but for 'link-count' it is trying to set to the
dict that is passed in lookup which can be NULL sometimes.
This is leading to error logs. Fixed the same in this patch.

 &gt;BUG: 1385104
 &gt;Change-Id: I679af89cfc410cbc35557ae0691763a05eb5ed0e
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: http://review.gluster.org/15646
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
 &gt;Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;

 &gt;BUG: 1385236
 &gt;Change-Id: I802e74e7ad24e183b6653101ad7bf5ab0bf6e55b
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: http://review.gluster.org/15650
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;

BUG: 1385442
Change-Id: Ie93b25e8cf52b0d58ef335929dfa10a78a0dd734
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15651
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In afr lookup when NULL dict is received in lookup, afr
is supposed to set all the xattrs it requires in a new dict
it creates, but for 'link-count' it is trying to set to the
dict that is passed in lookup which can be NULL sometimes.
This is leading to error logs. Fixed the same in this patch.

 &gt;BUG: 1385104
 &gt;Change-Id: I679af89cfc410cbc35557ae0691763a05eb5ed0e
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: http://review.gluster.org/15646
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
 &gt;Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;

 &gt;BUG: 1385236
 &gt;Change-Id: I802e74e7ad24e183b6653101ad7bf5ab0bf6e55b
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: http://review.gluster.org/15650
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;

BUG: 1385442
Change-Id: Ie93b25e8cf52b0d58ef335929dfa10a78a0dd734
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15651
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: Take full locks in arbiter only for data transactions</title>
<updated>2016-10-15T16:26:08+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2016-10-14T10:39:08+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=a1ee61051c6d284b9e632b975227c07cb4dda93d'/>
<id>a1ee61051c6d284b9e632b975227c07cb4dda93d</id>
<content type='text'>
Problem:
Sharding exposed a bug in arbiter config. where `dd` throughput was
extremely slow. Shard xlator was sending a fxattrop to update the file
size immediately after a writev. Arbiter was incorrectly over-riding the
LLONGMAX-1 start offset (for metadata domain locks) for this fxattrop,
causing the inodelk to be taken on the data domain. And since the
preceeding writev hadn't released the lock (afr does a 'lazy'
unlock if write succeeds on all bricks), this degraded to a blocking
lock causing extra lock/unlock calls and delays.

Fix:
Modify flock.l_len and flock.l_start to take full locks only for data
transactions.

&gt; Reviewed-on: http://review.gluster.org/15641
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
(cherry picked from commit 3a97486d7f9d0db51abcb13dcd3bc9db935e3a60)

Change-Id: I906895da2f2d16813607e6c906cb4defb21d7c3b
BUG: 1375125
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reported-by: Max Raba &lt;max.raba@comsysto.com&gt;
Reviewed-on: http://review.gluster.org/15647
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Sharding exposed a bug in arbiter config. where `dd` throughput was
extremely slow. Shard xlator was sending a fxattrop to update the file
size immediately after a writev. Arbiter was incorrectly over-riding the
LLONGMAX-1 start offset (for metadata domain locks) for this fxattrop,
causing the inodelk to be taken on the data domain. And since the
preceeding writev hadn't released the lock (afr does a 'lazy'
unlock if write succeeds on all bricks), this degraded to a blocking
lock causing extra lock/unlock calls and delays.

Fix:
Modify flock.l_len and flock.l_start to take full locks only for data
transactions.

&gt; Reviewed-on: http://review.gluster.org/15641
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
(cherry picked from commit 3a97486d7f9d0db51abcb13dcd3bc9db935e3a60)

Change-Id: I906895da2f2d16813607e6c906cb4defb21d7c3b
BUG: 1375125
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reported-by: Max Raba &lt;max.raba@comsysto.com&gt;
Reviewed-on: http://review.gluster.org/15647
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: Ignore gluster internal (virtual) xattrs in metadata heal check</title>
<updated>2016-09-29T07:10:18+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2016-09-27T05:09:58+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=44dbec60a2cd8fe6a68ff30cb6b8a1cf67b717be'/>
<id>44dbec60a2cd8fe6a68ff30cb6b8a1cf67b717be</id>
<content type='text'>
Backport of http://review.gluster.org/#/c/15548/

Problem:
In arbiter configuration, posix-xlator in the arbiter brick always sets the
GF_CONTENT_KEY in the response dict with a value 0. If the file size on the
data bricks is more than quick-read's max-file-size (64kb default), those
bricks don't set the key. Because of this difference in the no. of dict
elements, afr triggers metadata heal in lookup code path, in turn leading to
extra lookups+inodelks.

Fix:
Changed afr dict comparison logic to ignore all virtual xattrs and the on-disk
ones that we should not be healing.

Change-Id: I05730bdd39d8fb0b9a49a5fc9c0bb01f0d3bb308
BUG: 1377193
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15578
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Backport of http://review.gluster.org/#/c/15548/

Problem:
In arbiter configuration, posix-xlator in the arbiter brick always sets the
GF_CONTENT_KEY in the response dict with a value 0. If the file size on the
data bricks is more than quick-read's max-file-size (64kb default), those
bricks don't set the key. Because of this difference in the no. of dict
elements, afr triggers metadata heal in lookup code path, in turn leading to
extra lookups+inodelks.

Fix:
Changed afr dict comparison logic to ignore all virtual xattrs and the on-disk
ones that we should not be healing.

Change-Id: I05730bdd39d8fb0b9a49a5fc9c0bb01f0d3bb308
BUG: 1377193
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15578
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: copy loc before passing to syncop</title>
<updated>2016-08-23T15:08:42+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-08-02T09:49:00+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=036c4fcab82d4e69bf3e53d93f7da9b0b1dd900c'/>
<id>036c4fcab82d4e69bf3e53d93f7da9b0b1dd900c</id>
<content type='text'>
Problem:
When io-threads is enabled on the client side, io-threads destroys the
call-stub in which the loc is stored as soon as the c-stack unwinds.
Because afr is creating a syncop with the address of loc passed in
setxattr by the time syncop tries to access it, io-threads would have
already freed the call-stub. This will lead to crash.

Fix:
Copy loc to frame-&gt;local and use it's address.

&gt; Reviewed-on: http://review.gluster.org/15070
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;

BUG: 1369042
Change-Id: I16987e491e24b0b4e3d868a6968e802e47c77f7a
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Signed-off-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Reviewed-on: http://review.gluster.org/15233
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
When io-threads is enabled on the client side, io-threads destroys the
call-stub in which the loc is stored as soon as the c-stack unwinds.
Because afr is creating a syncop with the address of loc passed in
setxattr by the time syncop tries to access it, io-threads would have
already freed the call-stub. This will lead to crash.

Fix:
Copy loc to frame-&gt;local and use it's address.

&gt; Reviewed-on: http://review.gluster.org/15070
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;

BUG: 1369042
Change-Id: I16987e491e24b0b4e3d868a6968e802e47c77f7a
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Signed-off-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Reviewed-on: http://review.gluster.org/15233
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order</title>
<updated>2016-08-22T10:22:36+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-07-28T15:59:59+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=d99f72842595306e9f26a275804bf0f310caba53'/>
<id>d99f72842595306e9f26a275804bf0f310caba53</id>
<content type='text'>
        Backport of: http://review.gluster.org/15080

When the bricks are brought offline and then online in cyclic
order while writes are in progress on a file, thanks to inode
refresh in write txns, AFR will mostly fail the write attempt
when the only good copy is offline. However, there is still a
remote possibility that the file will run into split-brain if
the brick that has the lone good copy goes offline *after* the
inode refresh but *before* the write txn completes (I call it
in-flight split-brain in the patch for ease of reference),
requiring intervention from admin to resolve the split-brain
before the IO can resume normally on the file. To get around this,
the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused)
    in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.

This way, we still have one good copy left despite the split-brain situation
which when it is back online, will be chosen as source to do the heal.

&gt; Change-Id: I9ca634b026ac830b172bac076437cc3bf1ae7d8a
&gt; BUG: 1363721
&gt; Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
&gt; Reviewed-on: http://review.gluster.org/15080
&gt; Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
&gt; Reviewed-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
(cherry picked from commit fcb5b70b1099d0379b40c81f35750df8bb9545a5)

Change-Id: I157f1025aebd6624fa3d412abc69a4ae6f2fe9e0
BUG: 1367272
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Signed-off-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Reviewed-on: http://review.gluster.org/15221
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
        Backport of: http://review.gluster.org/15080

When the bricks are brought offline and then online in cyclic
order while writes are in progress on a file, thanks to inode
refresh in write txns, AFR will mostly fail the write attempt
when the only good copy is offline. However, there is still a
remote possibility that the file will run into split-brain if
the brick that has the lone good copy goes offline *after* the
inode refresh but *before* the write txn completes (I call it
in-flight split-brain in the patch for ease of reference),
requiring intervention from admin to resolve the split-brain
before the IO can resume normally on the file. To get around this,
the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused)
    in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.

This way, we still have one good copy left despite the split-brain situation
which when it is back online, will be chosen as source to do the heal.

&gt; Change-Id: I9ca634b026ac830b172bac076437cc3bf1ae7d8a
&gt; BUG: 1363721
&gt; Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
&gt; Reviewed-on: http://review.gluster.org/15080
&gt; Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
&gt; Reviewed-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
(cherry picked from commit fcb5b70b1099d0379b40c81f35750df8bb9545a5)

Change-Id: I157f1025aebd6624fa3d412abc69a4ae6f2fe9e0
BUG: 1367272
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Signed-off-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Reviewed-on: http://review.gluster.org/15221
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Bug fixes in txn codepath</title>
<updated>2016-08-17T10:22:15+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-08-05T06:48:05+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=62ad0bf74a97c20fb06161df3b2dd89f100bf617'/>
<id>62ad0bf74a97c20fb06161df3b2dd89f100bf617</id>
<content type='text'>
        Backport of: http://review.gluster.org/15145

AFR sets transaction.pre_op[] array even before actually doing the
pre-op on-disk. Therefore, AFR must not only consider the pre_op[] array
but also the failed_subvols[] information before setting the pre_op_done[]
flag. This patch fixes that.

Change-Id: I726b2acd4025e2e75a87dea547ca6e088bc82c00
BUG: 1367272
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15164
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Anuradha Talur &lt;atalur@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
        Backport of: http://review.gluster.org/15145

AFR sets transaction.pre_op[] array even before actually doing the
pre-op on-disk. Therefore, AFR must not only consider the pre_op[] array
but also the failed_subvols[] information before setting the pre_op_done[]
flag. This patch fixes that.

Change-Id: I726b2acd4025e2e75a87dea547ca6e088bc82c00
BUG: 1367272
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15164
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Anuradha Talur &lt;atalur@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: some coverity fixes</title>
<updated>2016-07-28T13:54:49+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2016-07-12T04:37:48+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=823eb274a3c4226aea44f6feb955a5df04aae190'/>
<id>823eb274a3c4226aea44f6feb955a5df04aae190</id>
<content type='text'>
Note: This is a backport of http://review.gluster.org/14895.
It contains:
i) fixes that prevent deadlocks (afr-common.c).
ii) fixes over-writing op-errno=ENOMEM with possible other values
(afr-inode-read.c).
iii) prevents doing further operations with a NULL dictionary if
allocation fails (afr-self-heal-data.c).
iv) prevents falsely marking a sink as healed if metadata heal fails
midway(afr-self-heal-metadata.c).
v) other minor fixes.

Considering the above are not trivial fixes, the patch is a good
candidate for merging in 3.8 branch.

Thanks to Krutika for a cleaner way to track inode refs in
afr_set_split_brain_choice().

Change-Id: I2d968d05b815ad764b7e3f8aa9ad95a792b3c1df
BUG: 1360556
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15018
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Note: This is a backport of http://review.gluster.org/14895.
It contains:
i) fixes that prevent deadlocks (afr-common.c).
ii) fixes over-writing op-errno=ENOMEM with possible other values
(afr-inode-read.c).
iii) prevents doing further operations with a NULL dictionary if
allocation fails (afr-self-heal-data.c).
iv) prevents falsely marking a sink as healed if metadata heal fails
midway(afr-self-heal-metadata.c).
v) other minor fixes.

Considering the above are not trivial fixes, the patch is a good
candidate for merging in 3.8 branch.

Thanks to Krutika for a cleaner way to track inode refs in
afr_set_split_brain_choice().

Change-Id: I2d968d05b815ad764b7e3f8aa9ad95a792b3c1df
BUG: 1360556
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15018
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
