<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/xlators/cluster/afr/src/afr-transaction.c, branch v7.8</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>afr: add null check for thin-arbiter gfid.</title>
<updated>2020-08-27T20:28:19+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2020-08-19T05:44:25+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=0d63dff4be92380f9b67f2b17397a02b8980abe1'/>
<id>0d63dff4be92380f9b67f2b17397a02b8980abe1</id>
<content type='text'>
Problem:
Lookup/creation of thin-arbiter ID file happens in background during
mounting. On new volumes, if the  ID file creation is in progress, and a
FOP fails on data brick, a post-op (xattrop) is attemtped on TA. Since
the TA file's gfid is null at this point, the ASSERT checks in protocol/
client causes a crash.

Fix:
Given that we decided to do Lookup/creation of thin-arbiter in
background, fail the other AFR FOPS on TA if the ID file's gfid is null
instead of winding it down to protocol/client.

Also remove afr_changelog_thin_arbiter_post_op() which seems to be dead
code.

Updates: #763
Change-Id: I70dc666faf55cc5c8f7cf8e7d36085e4fa399c4d
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
(cherry picked from commit f9b5074394e3d2f3b6728aab97230ba620879426)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Lookup/creation of thin-arbiter ID file happens in background during
mounting. On new volumes, if the  ID file creation is in progress, and a
FOP fails on data brick, a post-op (xattrop) is attemtped on TA. Since
the TA file's gfid is null at this point, the ASSERT checks in protocol/
client causes a crash.

Fix:
Given that we decided to do Lookup/creation of thin-arbiter in
background, fail the other AFR FOPS on TA if the ID file's gfid is null
instead of winding it down to protocol/client.

Also remove afr_changelog_thin_arbiter_post_op() which seems to be dead
code.

Updates: #763
Change-Id: I70dc666faf55cc5c8f7cf8e7d36085e4fa399c4d
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
(cherry picked from commit f9b5074394e3d2f3b6728aab97230ba620879426)
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Delay post-op for fsync</title>
<updated>2020-08-27T20:24:44+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2020-05-29T08:54:53+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=2b08fde39de3182bfa11e4a22490aa80c004ab03'/>
<id>2b08fde39de3182bfa11e4a22490aa80c004ab03</id>
<content type='text'>
Problem:
AFR doesn't delay post-op for fsync fop. For fsync heavy workloads
this leads to un-necessary fxattrop/finodelk for every fsync leading
to bad performance.

Fix:
Have delayed post-op for fsync. Add special flag in xdata to indicate
that afr shouldn't delay post-op in cases where either the
process will terminate or graph-switch would happen. Otherwise it leads
to un-necessary heals when the graph-switch/process-termination
happens before delayed-post-op completes.

Fixes: #1253
Change-Id: I531940d13269a111c49e0510d49514dc169f4577
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
AFR doesn't delay post-op for fsync fop. For fsync heavy workloads
this leads to un-necessary fxattrop/finodelk for every fsync leading
to bad performance.

Fix:
Have delayed post-op for fsync. Add special flag in xdata to indicate
that afr shouldn't delay post-op in cases where either the
process will terminate or graph-switch would happen. Otherwise it leads
to un-necessary heals when the graph-switch/process-termination
happens before delayed-post-op completes.

Fixes: #1253
Change-Id: I531940d13269a111c49e0510d49514dc169f4577
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Prioritize ENOSPC over other errors</title>
<updated>2020-06-22T05:42:20+00:00</updated>
<author>
<name>karthik-us</name>
<email>ksubrahm@redhat.com</email>
</author>
<published>2020-05-21T09:48:59+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=1a91aa4dbbe096ac0eb06c2a2977b6d30e555114'/>
<id>1a91aa4dbbe096ac0eb06c2a2977b6d30e555114</id>
<content type='text'>
Problem:
In a replicate/arbiter volume if file creations or writes fails on
quorum number of bricks and on one brick it is due to ENOSPC and
on other brick it fails for a different reason, it may fail with
errors other than ENOSPC in some cases.

Fix:
Prioritize ENOSPC over other lesser priority errors and do not set
op_errno in posix_gfid_set if op_ret is 0 to avoid receiving any
error_no which can be misinterpreted by __afr_dir_write_finalize().

Also removing the function afr_has_arbiter_fop_cbk_quorum() which
might consider a successful reply form a single brick as quorum
success in some cases, whereas we always need fop to be successful
on quorum number of bricks in arbiter configuration.

Change-Id: I106e267f8b9451f681022f1cccb410d9bc824c08
Fixes: #1254
Signed-off-by: karthik-us &lt;ksubrahm@redhat.com&gt;
(cherry picked from commit fa63b45ca5edf172b1b89b28b5db3c5129cc57b6)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
In a replicate/arbiter volume if file creations or writes fails on
quorum number of bricks and on one brick it is due to ENOSPC and
on other brick it fails for a different reason, it may fail with
errors other than ENOSPC in some cases.

Fix:
Prioritize ENOSPC over other lesser priority errors and do not set
op_errno in posix_gfid_set if op_ret is 0 to avoid receiving any
error_no which can be misinterpreted by __afr_dir_write_finalize().

Also removing the function afr_has_arbiter_fop_cbk_quorum() which
might consider a successful reply form a single brick as quorum
success in some cases, whereas we always need fop to be successful
on quorum number of bricks in arbiter configuration.

Change-Id: I106e267f8b9451f681022f1cccb410d9bc824c08
Fixes: #1254
Signed-off-by: karthik-us &lt;ksubrahm@redhat.com&gt;
(cherry picked from commit fa63b45ca5edf172b1b89b28b5db3c5129cc57b6)
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: thin-arbiter lock release fixes</title>
<updated>2019-05-10T12:11:22+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2019-04-09T04:14:33+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=9ab2747da78061882f6734df4b265bce11adaef1'/>
<id>9ab2747da78061882f6734df4b265bce11adaef1</id>
<content type='text'>
- pass fop state instead of afr local to
afr_ta_dom_lock_check_and_release()

- avoid afr_lock_release_synctask() being called simultaneosuly from
notify code path and transaction (post-op) code path due to races.

- Check if the post-op on TA is valid based on event_gen checks.

- Invalidate in-memory information when we get TA child down.

Note: Thi patch addresses some pending review comments of commit
053b1309dc8fbc05fcde5223e734da9f694cf5cc
(https://review.gluster.org/#/c/glusterfs/+/20095/)

fixes: bz#1698449
Change-Id: I2ccd7e1b53362f9f3fed8680aecb23b5011eb18c
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- pass fop state instead of afr local to
afr_ta_dom_lock_check_and_release()

- avoid afr_lock_release_synctask() being called simultaneosuly from
notify code path and transaction (post-op) code path due to races.

- Check if the post-op on TA is valid based on event_gen checks.

- Invalidate in-memory information when we get TA child down.

Note: Thi patch addresses some pending review comments of commit
053b1309dc8fbc05fcde5223e734da9f694cf5cc
(https://review.gluster.org/#/c/glusterfs/+/20095/)

fixes: bz#1698449
Change-Id: I2ccd7e1b53362f9f3fed8680aecb23b5011eb18c
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr : fix Coverity CID 1398627</title>
<updated>2019-05-07T05:36:08+00:00</updated>
<author>
<name>Rinku Kothiya</name>
<email>rkothiya@redhat.com</email>
</author>
<published>2019-04-23T15:55:22+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=bbb87ee8a433949eea8b3a2f0bef152ead37ccd7'/>
<id>bbb87ee8a433949eea8b3a2f0bef152ead37ccd7</id>
<content type='text'>
Fixed coverity error, "Unchecked return value (CHECKED_RETURN)".
Checking return value &amp; logging error message if afr_set_pending_dict
fails.

updates: bz#789278

Change-Id: Iab7da6b4f3cd0622b95b8e1c412b007a330467e5
Signed-off-by: Rinku Kothiya &lt;rkothiya@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fixed coverity error, "Unchecked return value (CHECKED_RETURN)".
Checking return value &amp; logging error message if afr_set_pending_dict
fails.

updates: bz#789278

Change-Id: Iab7da6b4f3cd0622b95b8e1c412b007a330467e5
Signed-off-by: Rinku Kothiya &lt;rkothiya@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Remove local from owners_list on failure of lock-acquisition</title>
<updated>2019-04-15T06:02:22+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2019-04-04T10:01:56+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=8239efc76b56f1f4ce16bab263f7b355d8205820'/>
<id>8239efc76b56f1f4ce16bab263f7b355d8205820</id>
<content type='text'>
When eager-lock lock acquisition fails because of say network failures, the
local is not being removed from owners_list, this leads to accumulation of
waiting frames and the application will hang because the waiting frames are
under the assumption that another transaction is in the process of acquiring
lock because owner-list is not empty. Handled this case as well in this patch.
Added asserts to make it easier to find these problems in future.

fixes bz#1696599
Change-Id: I3101393265e9827755725b1f2d94a93d8709e923
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When eager-lock lock acquisition fails because of say network failures, the
local is not being removed from owners_list, this leads to accumulation of
waiting frames and the application will hang because the waiting frames are
under the assumption that another transaction is in the process of acquiring
lock because owner-list is not empty. Handled this case as well in this patch.
Added asserts to make it easier to find these problems in future.

fixes bz#1696599
Change-Id: I3101393265e9827755725b1f2d94a93d8709e923
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr : TA: Return actual error code in case of failure</title>
<updated>2019-03-14T12:11:20+00:00</updated>
<author>
<name>Ashish Pandey</name>
<email>aspandey@redhat.com</email>
</author>
<published>2019-03-08T05:12:12+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=63159cdb5374f458d7d2bffec24d4720ffc96d6c'/>
<id>63159cdb5374f458d7d2bffec24d4720ffc96d6c</id>
<content type='text'>
In afr_ta_post_op_do, we were sending EIO for every failure.
However, the original error code should be sent.

Change-Id: I9fdc15dac00d758baf8e6f14db244f526481a63a
updates: bz#1686711
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In afr_ta_post_op_do, we were sending EIO for every failure.
However, the original error code should be sent.

Change-Id: I9fdc15dac00d758baf8e6f14db244f526481a63a
updates: bz#1686711
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: mark changelog_fsync as internal</title>
<updated>2019-03-05T09:22:46+00:00</updated>
<author>
<name>Soumya Koduri</name>
<email>skoduri@redhat.com</email>
</author>
<published>2019-02-27T17:14:10+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=82bedd01fd36d0b631367fab3e2cc4ca0b259378'/>
<id>82bedd01fd36d0b631367fab3e2cc4ca0b259378</id>
<content type='text'>
As afr_changelog_fsync is used for internal operations, use
GLUSTERFS_INTERNAL_FOP_KEY so that lease xlator can avoid treating
it as conflicting fop and recall lease.

Change-Id: I52cdc161002e840199d24439231a8bfa4f98b1b6
updates: bz#1648768
Signed-off-by: Soumya Koduri &lt;skoduri@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As afr_changelog_fsync is used for internal operations, use
GLUSTERFS_INTERNAL_FOP_KEY so that lease xlator can avoid treating
it as conflicting fop and recall lease.

Change-Id: I52cdc161002e840199d24439231a8bfa4f98b1b6
updates: bz#1648768
Signed-off-by: Soumya Koduri &lt;skoduri@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/thin-arbiter: Consider thin-arbiter before marking new entry changelog</title>
<updated>2019-02-01T05:44:36+00:00</updated>
<author>
<name>Ashish Pandey</name>
<email>aspandey@redhat.com</email>
</author>
<published>2018-12-21T09:01:15+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=6b98735956c599ea621fa560b201fb7de6c36cac'/>
<id>6b98735956c599ea621fa560b201fb7de6c36cac</id>
<content type='text'>
If a fop to create an entry fails on one of the data brick,
we mark the pending changelog on the entry on brick for which
it was successful. This is done as part of post op phase to
make sure that entry gets healed even if it gets renamed to
some other path where its parent was not marked as bad.

As it happens as part of post op, we should consider thin-arbiter
to check if the brick, which was successful, is the good brick or not.
This will avoide split brain and other issues.

Change-Id: I12686675be98f02f70a5186b3ed748c541514d53
updates: bz#1662264
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If a fop to create an entry fails on one of the data brick,
we mark the pending changelog on the entry on brick for which
it was successful. This is done as part of post op phase to
make sure that entry gets healed even if it gets renamed to
some other path where its parent was not marked as bad.

As it happens as part of post op, we should consider thin-arbiter
to check if the brick, which was successful, is the good brick or not.
This will avoide split brain and other issues.

Change-Id: I12686675be98f02f70a5186b3ed748c541514d53
updates: bz#1662264
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Refactor internal locking code to allow multiple inodelks</title>
<updated>2018-12-28T10:46:00+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2018-12-18T09:08:22+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=a12cadc1377ef51ad52defd1da91bf8f599e5786'/>
<id>a12cadc1377ef51ad52defd1da91bf8f599e5786</id>
<content type='text'>
For implementing copy_file_range fop, AFR needs to perform two inodelks in the
same transaction. This patch brings in the necessary structure to make it
easier to do so.

Entry-locks in AFR were already taking multiple entry-locks on different inodes
with the respective basenames. This patch extends the logic in inodelks to use
the same lockee_t structure. This lead to removal of quite a lot of duplicate
code present in afr-lk-common.c as both the locks are doing same things except
'winding' part.

updates: #536
Change-Id: Ibfce7e3f260bb27b18645152ec680c33866fe0ae
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For implementing copy_file_range fop, AFR needs to perform two inodelks in the
same transaction. This patch brings in the necessary structure to make it
easier to do so.

Entry-locks in AFR were already taking multiple entry-locks on different inodes
with the respective basenames. This patch extends the logic in inodelks to use
the same lockee_t structure. This lead to removal of quite a lot of duplicate
code present in afr-lk-common.c as both the locks are doing same things except
'winding' part.

updates: #536
Change-Id: Ibfce7e3f260bb27b18645152ec680c33866fe0ae
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
