<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/xlators/cluster, branch v3.7.15</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order</title>
<updated>2016-08-22T10:05:08+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-07-28T15:59:59+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=febaa1e46d3a91a29c4786a17abf29cfc7178254'/>
<id>febaa1e46d3a91a29c4786a17abf29cfc7178254</id>
<content type='text'>
        Backport of: http://review.gluster.org/15080

When the bricks are brought offline and then online in cyclic
order while writes are in progress on a file, thanks to inode
refresh in write txns, AFR will mostly fail the write attempt
when the only good copy is offline. However, there is still a
remote possibility that the file will run into split-brain if
the brick that has the lone good copy goes offline *after* the
inode refresh but *before* the write txn completes (I call it
in-flight split-brain in the patch for ease of reference),
requiring intervention from admin to resolve the split-brain
before the IO can resume normally on the file. To get around this,
the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused)
    in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.

This way, we still have one good copy left despite the split-brain situation;
when it is back online, it will be chosen as the source for the heal.

Change-Id: I7c13c6ddd5b8fe88b0f2684e8ce5f4a9c3a24a08
BUG: 1367270
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15222
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
        Backport of: http://review.gluster.org/15080

When the bricks are brought offline and then online in cyclic
order while writes are in progress on a file, thanks to inode
refresh in write txns, AFR will mostly fail the write attempt
when the only good copy is offline. However, there is still a
remote possibility that the file will run into split-brain if
the brick that has the lone good copy goes offline *after* the
inode refresh but *before* the write txn completes (I call it
in-flight split-brain in the patch for ease of reference),
requiring intervention from admin to resolve the split-brain
before the IO can resume normally on the file. To get around this,
the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused)
    in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.

This way, we still have one good copy left despite the split-brain situation;
when it is back online, it will be chosen as the source for the heal.

Change-Id: I7c13c6ddd5b8fe88b0f2684e8ce5f4a9c3a24a08
BUG: 1367270
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15222
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: copy loc before passing to syncop</title>
<updated>2016-08-17T11:44:11+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-08-02T09:49:00+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=318aacabbc482bcc2e1686988a77ad0bc054837e'/>
<id>318aacabbc482bcc2e1686988a77ad0bc054837e</id>
<content type='text'>
Problem:
When io-threads is enabled on the client side, io-threads destroys the
call-stub in which the loc is stored as soon as the c-stack unwinds.
Because afr creates a syncop with the address of the loc passed in
setxattr, by the time the syncop tries to access it, io-threads will have
already freed the call-stub. This leads to a crash.

Fix:
Copy loc to frame-&gt;local and use its address.

&gt; Reviewed-on: http://review.gluster.org/15070
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;

BUG: 1367305
Change-Id: I16987e491e24b0b4e3d868a6968e802e47c77f7a
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Signed-off-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Reviewed-on: http://review.gluster.org/15168
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
When io-threads is enabled on the client side, io-threads destroys the
call-stub in which the loc is stored as soon as the c-stack unwinds.
Because afr creates a syncop with the address of the loc passed in
setxattr, by the time the syncop tries to access it, io-threads will have
already freed the call-stub. This leads to a crash.

Fix:
Copy loc to frame-&gt;local and use its address.

&gt; Reviewed-on: http://review.gluster.org/15070
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;

BUG: 1367305
Change-Id: I16987e491e24b0b4e3d868a6968e802e47c77f7a
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Signed-off-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Reviewed-on: http://review.gluster.org/15168
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Bug fixes in txn codepath</title>
<updated>2016-08-17T10:22:53+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-08-05T06:48:05+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=e65e066c4f993aac626112e718ee66d35d15c6a8'/>
<id>e65e066c4f993aac626112e718ee66d35d15c6a8</id>
<content type='text'>
        Backport of: http://review.gluster.org/15145

AFR sets transaction.pre_op[] array even before actually doing the
pre-op on-disk. Therefore, AFR must not only consider the pre_op[] array
but also the failed_subvols[] information before setting the pre_op_done[]
flag. This patch fixes that.
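
As a rough illustration only (this is not the AFR code; the function name
compute_pre_op_done is hypothetical, and the per-subvolume rule is an
assumption drawn from the description above), the check could be sketched
in Python as:

```python
def compute_pre_op_done(pre_op, failed_subvols):
    """Return one flag per subvolume.

    Assumption for this sketch: a pre-op counts as done on a subvolume
    only if it was requested there (pre_op[i] is set) and the subvolume
    has not been recorded as failed (failed_subvols[i] is clear).
    """
    return [bool(p) and not f for p, f in zip(pre_op, failed_subvols)]
```

With pre_op set on the first two subvolumes and the second one marked as
failed, only the first subvolume ends up flagged as done.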

Change-Id: I8163256a6de254be43a7a526c6d2f9dc30e0e1df
BUG: 1367270
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15162
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Anuradha Talur &lt;atalur@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
        Backport of: http://review.gluster.org/15145

AFR sets transaction.pre_op[] array even before actually doing the
pre-op on-disk. Therefore, AFR must not only consider the pre_op[] array
but also the failed_subvols[] information before setting the pre_op_done[]
flag. This patch fixes that.

Change-Id: I8163256a6de254be43a7a526c6d2f9dc30e0e1df
BUG: 1367270
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15162
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Anuradha Talur &lt;atalur@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/dht: initialize cbk before attempting inode-link</title>
<updated>2016-08-17T06:28:07+00:00</updated>
<author>
<name>Raghavendra G</name>
<email>rgowdapp@redhat.com</email>
</author>
<published>2016-06-13T06:56:24+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=2440dace036a4955ca60d833b2ae514bab679126'/>
<id>2440dace036a4955ca60d833b2ae514bab679126</id>
<content type='text'>
Otherwise inode-link failures in selfheal codepath will result in a
crash.

This regression was introduced in master as a fix to 1334164. But that
patch never made it into 3.7. Hence, in essence, this patch is the 3.7
version of the fix to 1334164, minus the regression.

&gt; Change-Id: I9061629ae9d1eb1ac945af5f448d0d8b397a5022
&gt; BUG: 1345748
&gt; Signed-off-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
&gt; Reviewed-on: http://review.gluster.org/14707
&gt; Reviewed-by: N Balachandran &lt;nbalacha@redhat.com&gt;
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Poornima G &lt;pgurusid@redhat.com&gt;
&gt; Reviewed-by: Susant Palai &lt;spalai@redhat.com&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;

Signed-off-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
Change-Id: I9061629ae9d1eb1ac945af5f448d0d8b397a6022
BUG: 1366483
Reviewed-on: http://review.gluster.org/15163
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Otherwise inode-link failures in selfheal codepath will result in a
crash.

This regression was introduced in master as a fix to 1334164. But that
patch never made it into 3.7. Hence, in essence, this patch is the 3.7
version of the fix to 1334164, minus the regression.

&gt; Change-Id: I9061629ae9d1eb1ac945af5f448d0d8b397a5022
&gt; BUG: 1345748
&gt; Signed-off-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
&gt; Reviewed-on: http://review.gluster.org/14707
&gt; Reviewed-by: N Balachandran &lt;nbalacha@redhat.com&gt;
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Poornima G &lt;pgurusid@redhat.com&gt;
&gt; Reviewed-by: Susant Palai &lt;spalai@redhat.com&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;

Signed-off-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
Change-Id: I9061629ae9d1eb1ac945af5f448d0d8b397a6022
BUG: 1366483
Reviewed-on: http://review.gluster.org/15163
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dht/rebalance: allocate migrator thread pool dynamically</title>
<updated>2016-08-05T10:55:39+00:00</updated>
<author>
<name>Susant Palai</name>
<email>spalai@redhat.com</email>
</author>
<published>2016-08-04T07:01:24+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=59378892641a10ba0268a044310264e52afe8ea0'/>
<id>59378892641a10ba0268a044310264e52afe8ea0</id>
<content type='text'>
Problem: The maximum number of migrator threads was statically set
to "40", while the number of threads actually created in rebalance depends
on the number of cores the user has. If the number of cores exceeds 40, a
crash or memory corruption can be seen.

Fix: Make the migrator thread pool dynamic.
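
The difference can be sketched in Python, purely as an illustration (the
function names are hypothetical and this is not the rebalance code). In C,
writing past a fixed table corrupts memory; here the same mistake surfaces
as an IndexError:

```python
MAX_MIGRATE_THREADS = 40  # the old static limit quoted above


def allocate_static(requested):
    """Old approach (illustrative): a fixed table of 40 slots."""
    slots = [None] * MAX_MIGRATE_THREADS
    for i in range(requested):
        # Raises IndexError as soon as i reaches 40.
        slots[i] = "migrator-%d" % i
    return slots


def allocate_dynamic(requested):
    """New approach (illustrative): size the table from the request."""
    return ["migrator-%d" % i for i in range(requested)]
```

Asking for 64 threads overruns the fixed table but works with the
dynamically sized one.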

&gt; Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839
&gt; BUG: 1362070
&gt; Reviewed-on: http://review.gluster.org/15000
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
(cherry picked from commit b8e8bfc7e4d3eaf76bb637221bc6392ec10ca54b)

Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839
BUG: 1362070
Signed-off-by: Susant Palai &lt;spalai@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15062
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem: The maximum number of migrator threads was statically set
to "40", while the number of threads actually created in rebalance depends
on the number of cores the user has. If the number of cores exceeds 40, a
crash or memory corruption can be seen.

Fix: Make the migrator thread pool dynamic.

&gt; Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839
&gt; BUG: 1362070
&gt; Reviewed-on: http://review.gluster.org/15000
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
(cherry picked from commit b8e8bfc7e4d3eaf76bb637221bc6392ec10ca54b)

Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839
BUG: 1362070
Signed-off-by: Susant Palai &lt;spalai@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15062
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Unlock stale locks when inodelk/entrylk/lk fails</title>
<updated>2016-07-30T01:04:22+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-06-11T13:13:42+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=4a49d3116b00811b7a4ba98f51bb27da0fa63d5c'/>
<id>4a49d3116b00811b7a4ba98f51bb27da0fa63d5c</id>
<content type='text'>
Thanks to Rafi for hinting a while back that he had once seen this kind
of problem. I didn't think the theory was valid.
Could have caught it earlier if I had tested his theory.

 &gt;Change-Id: Iac6ffcdba2950aa6f8cf94f8994adeed6e6a9c9b
 &gt;BUG: 1344836
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: http://review.gluster.org/14703
 &gt;Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
 &gt;Tested-by: mohammed rafi  kc &lt;rkavunga@redhat.com&gt;
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;

BUG: 1361402
Change-Id: If9ccf0b3db7159b87ddcdc7b20e81cde8c3c76f0
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15040
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Thanks to Rafi for hinting a while back that he had once seen this kind
of problem. I didn't think the theory was valid.
Could have caught it earlier if I had tested his theory.

 &gt;Change-Id: Iac6ffcdba2950aa6f8cf94f8994adeed6e6a9c9b
 &gt;BUG: 1344836
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: http://review.gluster.org/14703
 &gt;Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
 &gt;Tested-by: mohammed rafi  kc &lt;rkavunga@redhat.com&gt;
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;

BUG: 1361402
Change-Id: If9ccf0b3db7159b87ddcdc7b20e81cde8c3c76f0
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15040
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: some coverity fixes</title>
<updated>2016-07-28T13:54:36+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2016-07-12T04:37:48+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=53b54f1a03a5fb93266f66453d228258598f8ef6'/>
<id>53b54f1a03a5fb93266f66453d228258598f8ef6</id>
<content type='text'>
Backport of http://review.gluster.org/#/c/14895/

Thanks to Krutika for a cleaner way to track inode refs in
afr_set_split_brain_choice().

Change-Id: I2d968d05b815ad764b7e3f8aa9ad95a792b3c1df
BUG: 1360549
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15017
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Backport of http://review.gluster.org/#/c/14895/

Thanks to Krutika for a cleaner way to track inode refs in
afr_set_split_brain_choice().

Change-Id: I2d968d05b815ad764b7e3f8aa9ad95a792b3c1df
BUG: 1360549
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15017
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Handle absence of keys in some callback dict</title>
<updated>2016-07-27T07:00:04+00:00</updated>
<author>
<name>Ashish Pandey</name>
<email>aspandey@redhat.com</email>
</author>
<published>2016-06-17T12:22:56+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=1e3a8f47cd88c39c41519d143b001d45387eb4b8'/>
<id>1e3a8f47cd88c39c41519d143b001d45387eb4b8</id>
<content type='text'>
Problem: This issue arises when we do a rolling update
from 3.7.5 to 3.7.9.
For a 4+2 volume running 3.7.5, if we update 2 nodes
and, after heal completion, kill 2 older nodes, this
problem can be seen. After the update and killing of
bricks, 2 nodes will return the inodelk count key in the dict
while the other 2 nodes will not have inodelk count in the dict.
This is also true for get-link-count.
During the dictionary match in ec_dict_compare, this will
lead to a mismatch of answers, and the file operation
on the mount point will fail with an IO error.

Solution:
Don't match inode, entry and link count keys while
comparing two dictionaries. However, while combining the
data in ec_dict_combine, go through all the dictionaries
and select the maximum values received in different dicts
for these keys.
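
A minimal Python sketch of this idea, for illustration only: the key
names in COUNT_KEYS are placeholders rather than the real dict keys, and
dicts_match/combine are hypothetical stand-ins for ec_dict_compare and
ec_dict_combine.

```python
# Placeholder names for the inodelk, entrylk and link count keys.
COUNT_KEYS = {"inodelk-count", "entrylk-count", "link-count"}


def without_counts(d):
    """Copy of d with the count keys left out."""
    return {k: v for k, v in d.items() if k not in COUNT_KEYS}


def dicts_match(a, b):
    """Compare two answers while ignoring the count keys."""
    return without_counts(a) == without_counts(b)


def combine(dicts):
    """Merge answers, keeping the maximum seen for each count key."""
    combined = {}
    for d in dicts:
        for key, value in d.items():
            if key in COUNT_KEYS:
                combined[key] = max(combined.get(key, 0), value)
            else:
                combined.setdefault(key, value)
    return combined
```

An answer from an older brick that lacks a count key still matches one
from an updated brick, and the combined dict carries the largest count
that any brick reported.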

master-
http://review.gluster.org/#/c/14761/

Change-Id: I33546e3619fe8f909286ee48fb0df2009cd3d22f
BUG: 1360152
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
Reviewed-on: http://review.gluster.org/14761
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15012
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem: This issue arises when we do a rolling update
from 3.7.5 to 3.7.9.
For a 4+2 volume running 3.7.5, if we update 2 nodes
and, after heal completion, kill 2 older nodes, this
problem can be seen. After the update and killing of
bricks, 2 nodes will return the inodelk count key in the dict
while the other 2 nodes will not have inodelk count in the dict.
This is also true for get-link-count.
During the dictionary match in ec_dict_compare, this will
lead to a mismatch of answers, and the file operation
on the mount point will fail with an IO error.

Solution:
Don't match inode, entry and link count keys while
comparing two dictionaries. However, while combining the
data in ec_dict_combine, go through all the dictionaries
and select the maximum values received in different dicts
for these keys.

master-
http://review.gluster.org/#/c/14761/

Change-Id: I33546e3619fe8f909286ee48fb0df2009cd3d22f
BUG: 1360152
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
Reviewed-on: http://review.gluster.org/14761
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15012
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Fix race in timer cancellation</title>
<updated>2016-07-18T06:29:05+00:00</updated>
<author>
<name>Xavier Hernandez</name>
<email>xhernandez@datalab.es</email>
</author>
<published>2016-06-13T10:42:47+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=74d2aaf51c7ff601e4394cad9f8e23092267af55'/>
<id>74d2aaf51c7ff601e4394cad9f8e23092267af55</id>
<content type='text'>
A race in timer cancellation for delayed unlock could cause a crash
if the cancelling thread fails to cancel the timer because it has
already been fired but not executed, and the callback is scheduled
out of the CPU, delaying it until the thread has released important
resources needed by the callback.

This patch improves the handling of this case to make it robust.

Backport of:
&gt; Change-Id: I5c8a8c6610c5136f71b938aa78b5878ba05238d4
&gt; BUG: 1345855
&gt; Signed-off-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
&gt; Reviewed-on: http://review.gluster.org/14712
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
&gt; Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;

Change-Id: I5c8a8c6610c5136f71b938aa78b5878ba05238d4
BUG: 1346156
Signed-off-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
Reviewed-on: http://review.gluster.org/14724
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A race in timer cancellation for delayed unlock could cause a crash
if the cancelling thread fails to cancel the timer because it has
already been fired but not executed, and the callback is scheduled
out of the CPU, delaying it until the thread has released important
resources needed by the callback.

This patch improves the handling of this case to make it robust.

Backport of:
&gt; Change-Id: I5c8a8c6610c5136f71b938aa78b5878ba05238d4
&gt; BUG: 1345855
&gt; Signed-off-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
&gt; Reviewed-on: http://review.gluster.org/14712
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
&gt; Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;

Change-Id: I5c8a8c6610c5136f71b938aa78b5878ba05238d4
BUG: 1346156
Signed-off-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
Reviewed-on: http://review.gluster.org/14724
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr:Don't wind reads for files in metadata split-brain</title>
<updated>2016-06-27T07:13:36+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2016-02-05T09:40:06+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=e4ea25e9eea0f7259c11333f7a75049f3dccb7a7'/>
<id>e4ea25e9eea0f7259c11333f7a75049f3dccb7a7</id>
<content type='text'>
Backport of http://review.gluster.org/#/c/13389/

Problem: For a read on a file in metadata split-brain:
1. lookup_done resets event_generation to zero.
2. readv is issued, goes to inode refresh due to mismatching event_gen.
3. After refresh is successful, we update event_generation, data and
metadata readable.
4. We then call afr_read_txn_refresh_done() which in turn calls
afr_inode_get_readable() but doesn't check for EIO. So afr_readv_wind
is called with local-&gt;readable (which is populated with data_readable),
thus winding the read to a brick.
5. Also, further parallel reads that come directly go to the wind path
because there is no inode_refresh needed.

Fix:
1. For any afr_read_txn(), readable must be an intersection of data and metadata
readable.
2. Check for EIO in afr_read_txn_refresh_done().
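
The two parts of the fix can be sketched in Python as follows. This is
an illustration under stated assumptions, not the AFR code: the function
names and the (status, subvolume index) return value are hypothetical.

```python
import errno


def intersect_readable(data_readable, metadata_readable):
    """A subvolume is readable only if both data and metadata are good."""
    return [d and m for d, m in zip(data_readable, metadata_readable)]


def read_txn_refresh_done(data_readable, metadata_readable):
    """Pick a subvolume to read from, or fail with EIO."""
    readable = intersect_readable(data_readable, metadata_readable)
    if not any(readable):
        # No brick is good for both data and metadata, so fail the
        # read instead of winding it to a brick.
        return (-errno.EIO, None)
    # Otherwise wind the read to the first readable subvolume.
    return (0, readable.index(True))
```

With two bricks where one is good only for data and the other only for
metadata, the intersection is empty and the read fails with EIO.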

Change-Id: I22dd221fdfaf96d7aced2f474e28ed1337d69f0e
BUG: 1349881
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
(cherry picked from commit 7a1c1e2904701496968ed14b6d7479fb706c3188)
Reviewed-on: http://review.gluster.org/14791
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Backport of http://review.gluster.org/#/c/13389/

Problem: For a read on a file in metadata split-brain:
1. lookup_done resets event_generation to zero.
2. readv is issued, goes to inode refresh due to mismatching event_gen.
3. After refresh is successful, we update event_generation, data and
metadata readable.
4. We then call afr_read_txn_refresh_done() which in turn calls
afr_inode_get_readable() but doesn't check for EIO. So afr_readv_wind
is called with local-&gt;readable (which is populated with data_readable),
thus winding the read to a brick.
5. Also, further parallel reads that come directly go to the wind path
because there is no inode_refresh needed.

Fix:
1. For any afr_read_txn(), readable must be an intersection of data and metadata
readable.
2. Check for EIO in afr_read_txn_refresh_done().

Change-Id: I22dd221fdfaf96d7aced2f474e28ed1337d69f0e
BUG: 1349881
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
(cherry picked from commit 7a1c1e2904701496968ed14b6d7479fb706c3188)
Reviewed-on: http://review.gluster.org/14791
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
