<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/xlators/cluster/ec/src/ec-locks.c, branch v3.9.0rc2</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order</title>
<updated>2016-08-22T09:38:36+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-07-28T15:59:59+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=fcb5b70b1099d0379b40c81f35750df8bb9545a5'/>
<id>fcb5b70b1099d0379b40c81f35750df8bb9545a5</id>
<content type='text'>
When the bricks are brought offline and then online in cyclic
order while writes are in progress on a file, thanks to inode
refresh in write txns, AFR will mostly fail the write attempt
when the only good copy is offline. However, there is still a
remote possibility that the file will run into split-brain if
the brick that has the lone good copy goes offline *after* the
inode refresh but *before* the write txn completes (I call it
in-flight split-brain in the patch for ease of reference),
requiring intervention from admin to resolve the split-brain
before the IO can resume normally on the file. To get around this,
the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused)
    in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.

This way, we still have one good copy left despite the split-brain situation,
which, when it is back online, will be chosen as the source for the heal.

Change-Id: I9ca634b026ac830b172bac076437cc3bf1ae7d8a
BUG: 1363721
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15080
Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When the bricks are brought offline and then online in cyclic
order while writes are in progress on a file, thanks to inode
refresh in write txns, AFR will mostly fail the write attempt
when the only good copy is offline. However, there is still a
remote possibility that the file will run into split-brain if
the brick that has the lone good copy goes offline *after* the
inode refresh but *before* the write txn completes (I call it
in-flight split-brain in the patch for ease of reference),
requiring intervention from admin to resolve the split-brain
before the IO can resume normally on the file. To get around this,
the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused)
    in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.

This way, we still have one good copy left despite the split-brain situation,
which, when it is back online, will be chosen as the source for the heal.

Change-Id: I9ca634b026ac830b172bac076437cc3bf1ae7d8a
BUG: 1363721
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15080
Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Unlock stale locks when inodelk/entrylk/lk fails</title>
<updated>2016-06-14T10:48:54+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-06-11T13:13:42+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=d25237709278f9530c7a3989a37254c628539375'/>
<id>d25237709278f9530c7a3989a37254c628539375</id>
<content type='text'>
Thanks to Rafi for hinting a while back that he once saw this kind of
problem. I didn't think the theory was valid.
Could have caught it earlier if I had tested his theory.

Change-Id: Iac6ffcdba2950aa6f8cf94f8994adeed6e6a9c9b
BUG: 1344836
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/14703
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Tested-by: mohammed rafi  kc &lt;rkavunga@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Thanks to Rafi for hinting a while back that he once saw this kind of
problem. I didn't think the theory was valid.
Could have caught it earlier if I had tested his theory.

Change-Id: Iac6ffcdba2950aa6f8cf94f8994adeed6e6a9c9b
BUG: 1344836
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/14703
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Tested-by: mohammed rafi  kc &lt;rkavunga@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Fix tracking of good bricks</title>
<updated>2015-08-06T17:12:22+00:00</updated>
<author>
<name>Xavier Hernandez</name>
<email>xhernandez@datalab.es</email>
</author>
<published>2015-08-05T21:42:41+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=7298b622ab39c2e78d6d745ae8b6e8413e1d9f1a'/>
<id>7298b622ab39c2e78d6d745ae8b6e8413e1d9f1a</id>
<content type='text'>
The bitmask of good and bad bricks was kept in the context of the
corresponding inode or fd. This was problematic when an external
process (another client or the self-heal process) healed the
bricks but nothing updated the bitmasks of other clients.

This patch removes the bitmask stored in the context and calculates
which bricks are healthy after locking them and doing the initial
xattrop. After that, it's updated using the result of each fop.

Change-Id: I225e31cd219a12af4ca58871d8a4bb6f742b223c
BUG: 1236065
Signed-off-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
Reviewed-on: http://review.gluster.org/11844
Tested-by: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The bitmask of good and bad bricks was kept in the context of the
corresponding inode or fd. This was problematic when an external
process (another client or the self-heal process) healed the
bricks but nothing updated the bitmasks of other clients.

This patch removes the bitmask stored in the context and calculates
which bricks are healthy after locking them and doing the initial
xattrop. After that, it's updated using the result of each fop.

Change-Id: I225e31cd219a12af4ca58871d8a4bb6f742b223c
BUG: 1236065
Signed-off-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
Reviewed-on: http://review.gluster.org/11844
Tested-by: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Minimize usage of EIO error</title>
<updated>2015-07-28T11:12:17+00:00</updated>
<author>
<name>Xavier Hernandez</name>
<email>xhernandez@datalab.es</email>
</author>
<published>2015-07-21T16:05:06+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=8d915d196fc591b141bb5267e16453d18dff7955'/>
<id>8d915d196fc591b141bb5267e16453d18dff7955</id>
<content type='text'>
Change-Id: I82e245615419c2006a2d1b5e94ff0908d2f5e891
BUG: 1245276
Signed-off-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
Reviewed-on: http://review.gluster.org/11741
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Tested-by: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: I82e245615419c2006a2d1b5e94ff0908d2f5e891
BUG: 1245276
Signed-off-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
Reviewed-on: http://review.gluster.org/11741
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Tested-by: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Propagate correct errno in case of failures</title>
<updated>2015-07-15T00:05:00+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2015-07-12T13:07:43+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=e4fa8568478279c1884c54b9a369655ffa559d4a'/>
<id>e4fa8568478279c1884c54b9a369655ffa559d4a</id>
<content type='text'>
- Also remove internal-fop setting in create/mknod etc xattrs.

Rebalance was failing because ec was returning EIO when lock acquisition failed
because the file/dir doesn't exist. Posix_create/mknod were not setting the config
xattr because the internal-fop key is present in the dict; setxattr for this key
fails, which leads to failure in setting the rest of the xattrs.

Change-Id: Ifb429c8db9df7cd51e4f8ce53fdf1e1b975c9993
BUG: 1242254
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/11639
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
Tested-by: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Also remove internal-fop setting in create/mknod etc xattrs.

Rebalance was failing because ec was returning EIO when lock acquisition failed
because the file/dir doesn't exist. Posix_create/mknod were not setting the config
xattr because the internal-fop key is present in the dict; setxattr for this key
fails, which leads to failure in setting the rest of the xattrs.

Change-Id: Ifb429c8db9df7cd51e4f8ce53fdf1e1b975c9993
BUG: 1242254
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/11639
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
Tested-by: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ec: Porting messages to new logging framework</title>
<updated>2015-06-26T15:51:59+00:00</updated>
<author>
<name>Nandaja Varma</name>
<email>nandaja.varma@gmail.com</email>
</author>
<published>2015-04-30T08:58:10+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=87af7e72d8be95ac0f2ade88f3a9ba16392fd158'/>
<id>87af7e72d8be95ac0f2ade88f3a9ba16392fd158</id>
<content type='text'>
Change-Id: Ia05ae750a245a37d48978e5f37b52f4fb0507a8c
BUG: 1194640
Signed-off-by: Nandaja Varma &lt;nandaja.varma@gmail.com&gt;
Reviewed-on: http://review.gluster.org/10465
Tested-by: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: Ia05ae750a245a37d48978e5f37b52f4fb0507a8c
BUG: 1194640
Signed-off-by: Nandaja Varma &lt;nandaja.varma@gmail.com&gt;
Reviewed-on: http://review.gluster.org/10465
Tested-by: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Prevent unnecessary self-heals</title>
<updated>2015-05-15T08:24:51+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2015-05-13T11:27:49+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=503acdb32ca84102d07cd1142eff464152b06690'/>
<id>503acdb32ca84102d07cd1142eff464152b06690</id>
<content type='text'>
When a blocking lock is requested, the lock request is treated as successful even
when only ec-&gt;fragment number of locks are acquired in the non-blocking locking
phase. The fop then succeeds only on the bricks where the locks were
acquired, which makes self-heals necessary. To prevent these unnecessary
self-heals, if the remaining locks fail with EAGAIN in the non-blocking lock phase,
try the blocking locking phase instead.

Change-Id: I940969e39acc620ccde2a876546cea77f7e130b6
BUG: 1221145
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10770
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a blocking lock is requested, the lock request is treated as successful even
when only ec-&gt;fragment number of locks are acquired in the non-blocking locking
phase. The fop then succeeds only on the bricks where the locks were
acquired, which makes self-heals necessary. To prevent these unnecessary
self-heals, if the remaining locks fail with EAGAIN in the non-blocking lock phase,
try the blocking locking phase instead.

Change-Id: I940969e39acc620ccde2a876546cea77f7e130b6
BUG: 1221145
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10770
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ec: Fix failures with missing files</title>
<updated>2015-05-10T00:29:46+00:00</updated>
<author>
<name>Xavier Hernandez</name>
<email>xhernandez@datalab.es</email>
</author>
<published>2015-01-07T11:29:48+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=b46e65db722c14985db62a1679e0388d217b713b'/>
<id>b46e65db722c14985db62a1679e0388d217b713b</id>
<content type='text'>
When a file does not exist on a brick but does on others, there
could be problems accessing it because some loc_t structures had
a null 'pargfid' but 'name' was set. This forced
inode resolution based on &lt;pargfid&gt;/name instead of &lt;gfid&gt;, which
would be the correct one. To solve this problem, 'name' is always
set to NULL when 'pargfid' is not present.

Another problem was caused by incorrect handling of errors
during incremental locking. The only error allowed during
incremental locking was ENOTCONN, but missing files on a brick can
be returned as ESTALE. This caused an EIO on the operation.

This patch ignores errors during incremental locking. At
the end of the operation it checks whether enough bricks were
successfully locked to continue.

Change-Id: I9360ebf8d819d219cea2d173c09bd37679a6f15a
BUG: 1176062
Signed-off-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
Reviewed-on: http://review.gluster.org/9407
Tested-by: NetBSD Build System
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a file does not exist on a brick but does on others, there
could be problems accessing it because some loc_t structures had
a null 'pargfid' but 'name' was set. This forced
inode resolution based on &lt;pargfid&gt;/name instead of &lt;gfid&gt;, which
would be the correct one. To solve this problem, 'name' is always
set to NULL when 'pargfid' is not present.

Another problem was caused by incorrect handling of errors
during incremental locking. The only error allowed during
incremental locking was ENOTCONN, but missing files on a brick can
be returned as ESTALE. This caused an EIO on the operation.

This patch ignores errors during incremental locking. At
the end of the operation it checks whether enough bricks were
successfully locked to continue.

Change-Id: I9360ebf8d819d219cea2d173c09bd37679a6f15a
BUG: 1176062
Signed-off-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
Reviewed-on: http://review.gluster.org/9407
Tested-by: NetBSD Build System
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ec: Fix return errors when not enough bricks</title>
<updated>2014-12-05T11:39:07+00:00</updated>
<author>
<name>Xavier Hernandez</name>
<email>xhernandez@datalab.es</email>
</author>
<published>2014-11-11T17:45:01+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=36236eecef55c710e1f11ba4a04fe01da67cab6a'/>
<id>36236eecef55c710e1f11ba4a04fe01da67cab6a</id>
<content type='text'>
Changes introduced by this patch:

* Fix incorrect error propagation when a state in the life
  cycle of a fop returns an error.

* Fix incorrect unlocking of failed locks.

* Return ENOTCONN if there aren't enough bricks online.

* In readdir(p), check that the fd has been successfully opened by
  a previous opendir.

Change-Id: Ib44f25a1297849ebcbab839332f3b6359f275ebe
BUG: 1162805
Signed-off-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
Reviewed-on: http://review.gluster.org/9098
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Changes introduced by this patch:

* Fix incorrect error propagation when a state in the life
  cycle of a fop returns an error.

* Fix incorrect unlocking of failed locks.

* Return ENOTCONN if there aren't enough bricks online.

* In readdir(p), check that the fd has been successfully opened by
  a previous opendir.

Change-Id: Ib44f25a1297849ebcbab839332f3b6359f275ebe
BUG: 1162805
Signed-off-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
Reviewed-on: http://review.gluster.org/9098
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ec: Change license</title>
<updated>2014-12-03T18:45:26+00:00</updated>
<author>
<name>Xavier Hernandez</name>
<email>xhernandez@datalab.es</email>
</author>
<published>2014-11-26T11:17:08+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=1a5a0e11f2098961eccb6e312be123b0061b6eb1'/>
<id>1a5a0e11f2098961eccb6e312be123b0061b6eb1</id>
<content type='text'>
Change-Id: Iae90ade2421898417b53dec0417a610cf306c44b
BUG: 1168167
Signed-off-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
Reviewed-on: http://review.gluster.org/9201
Reviewed-by: Kaleb KEITHLEY &lt;kkeithle@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: Iae90ade2421898417b53dec0417a610cf306c44b
BUG: 1168167
Signed-off-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
Reviewed-on: http://review.gluster.org/9201
Reviewed-by: Kaleb KEITHLEY &lt;kkeithle@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
