<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/tests, branch exp</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>tests: Fix spurious failure in bug-1402841.t-mt-dir-scan-race.t</title>
<updated>2016-12-19T09:31:52+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-12-16T16:11:58+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=2bb2313656a19ee8efe3cfeda4f2ae90e8e49b62'/>
<id>2bb2313656a19ee8efe3cfeda4f2ae90e8e49b62</id>
<content type='text'>
Check that shd is up before executing 'volume heal' command

Change-Id: Ib510a5de06d732fd3a738e90fa16376698479897
BUG: 1405554
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16169
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra Talur &lt;rtalur@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Check that shd is up before executing 'volume heal' command

Change-Id: Ib510a5de06d732fd3a738e90fa16376698479897
BUG: 1405554
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16169
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra Talur &lt;rtalur@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests: Move tests/basic/gfapi/bug1291259.t to bad tests list</title>
<updated>2016-12-19T09:30:50+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-12-16T07:07:41+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=5542b404f4a6b5968d52175ea10d694aabf906f1'/>
<id>5542b404f4a6b5968d52175ea10d694aabf906f1</id>
<content type='text'>
See thread
http://www.gluster.org/pipermail/gluster-devel/2016-December/051714.html
for more information.

Change-Id: I9abe4b0e40499e53c1276a10a6bc192fd0f2cef7
BUG: 1405301
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16159
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra Talur &lt;rtalur@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
See thread
http://www.gluster.org/pipermail/gluster-devel/2016-December/051714.html
for more information.

Change-Id: I9abe4b0e40499e53c1276a10a6bc192fd0f2cef7
BUG: 1405301
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16159
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra Talur &lt;rtalur@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests: Fix spurious test failure in bug-1316437.t</title>
<updated>2016-12-16T14:36:46+00:00</updated>
<author>
<name>Rajesh Joseph</name>
<email>rjoseph@redhat.com</email>
</author>
<published>2016-12-15T15:21:30+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=e9d8525a0d34130ba2a582109937b8e79eecf6ab'/>
<id>e9d8525a0d34130ba2a582109937b8e79eecf6ab</id>
<content type='text'>
After sending SIGTERM to gluster process we immediately
check if process exited. We should wait for some time
before checking process state.

BUG: 1404573
Change-Id: Iaba0067f6e880a7fe38e11b9fa0fe9bd103b19e2
Signed-off-by: Rajesh Joseph &lt;rjoseph@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16162
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Avra Sengupta &lt;asengupt@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: N Balachandran &lt;nbalacha@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After sending SIGTERM to gluster process we immediately
check if process exited. We should wait for some time
before checking process state.

BUG: 1404573
Change-Id: Iaba0067f6e880a7fe38e11b9fa0fe9bd103b19e2
Signed-off-by: Rajesh Joseph &lt;rjoseph@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16162
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Avra Sengupta &lt;asengupt@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: N Balachandran &lt;nbalacha@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>glfsheal: Explicitly enable self-heal xlator options</title>
<updated>2016-12-15T15:46:26+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2016-12-14T17:18:20+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=209c2d447be874047cb98d86492b03fa807d1832'/>
<id>209c2d447be874047cb98d86492b03fa807d1832</id>
<content type='text'>
Enable data, metadata and entry self-heal as xlator-options so that glfs-heal.c
can heal split-brain files even if they are disabled on the volume via volume
set commands.

Change-Id: Ic191a1017131db1ded94d97c932079d7bfd79457
BUG: 1234054
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: http://review.gluster.org/11333
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Enable data, metadata and entry self-heal as xlator-options so that glfs-heal.c
can heal split-brain files even if they are disabled on the volume via volume
set commands.

Change-Id: Ic191a1017131db1ded94d97c932079d7bfd79457
BUG: 1234054
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: http://review.gluster.org/11333
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>access_control : address O_TRUNC and O_APPEND flag properly in posix_acl_open</title>
<updated>2016-12-14T13:01:07+00:00</updated>
<author>
<name>Jiffin Tony Thottan</name>
<email>jthottan@redhat.com</email>
</author>
<published>2016-11-17T12:52:39+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=e81fd0b85c8dd3f521e54e32b7da2f99a513f2f2'/>
<id>e81fd0b85c8dd3f521e54e32b7da2f99a513f2f2</id>
<content type='text'>
In posix_acl_open, in switch value passed is (flag &amp; O_ACCMODE). The value for
O_ACCMODE is 0003, so the result will always be less than or equal to 3.
But value for O_TRUNC is 01000 and O_APPEND is 02000, so it is not right to
check it in switch case

Change-Id: Ia17db80a6a5f681c35e08e062d384f33ef7e0354
BUG: 1387241
Signed-off-by: Jiffin Tony Thottan &lt;jthottan@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15688
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
Reviewed-by: Kaleb KEITHLEY &lt;kkeithle@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In posix_acl_open, in switch value passed is (flag &amp; O_ACCMODE). The value for
O_ACCMODE is 0003, so the result will always be less than or equal to 3.
But value for O_TRUNC is 01000 and O_APPEND is 02000, so it is not right to
check it in switch case

Change-Id: Ia17db80a6a5f681c35e08e062d384f33ef7e0354
BUG: 1387241
Signed-off-by: Jiffin Tony Thottan &lt;jthottan@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15688
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
Reviewed-by: Kaleb KEITHLEY &lt;kkeithle@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Fix per-txn optimistic changelog initialisation</title>
<updated>2016-12-12T16:38:49+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-12-08T17:19:48+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=2b76520ca3e41cbac8f9318dce87e0b8d670c0ee'/>
<id>2b76520ca3e41cbac8f9318dce87e0b8d670c0ee</id>
<content type='text'>
Incorrect initialisation of local-&gt;optimistic_change_log was leading
to skipped pre-op and post-op even when a brick didn't participate in
the txn because it was down.
The result - missing granular name index resulting in some entries
never getting healed.

FIX:
Initialise local-&gt;optimistic_change_log just before pre-op.

Also fixed granular entry heal to create the granular name index in
pre-op as opposed to post-op. This is to prevent loss of granular
information when during an entry txn, the good (src) brick goes
offline before the post-op is done. This would cause self-heal to
do conservative merge (since dirty xattr is the only information
available), which when granular-entry-heal is enabled, expects
granular indices, the lack of which can lead to loss of data in
the worst case.

Change-Id: Ia3ad716d6fb1821555f02180e86e8711a79f958d
BUG: 1402730
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16075
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Incorrect initialisation of local-&gt;optimistic_change_log was leading
to skipped pre-op and post-op even when a brick didn't participate in
the txn because it was down.
The result - missing granular name index resulting in some entries
never getting healed.

FIX:
Initialise local-&gt;optimistic_change_log just before pre-op.

Also fixed granular entry heal to create the granular name index in
pre-op as opposed to post-op. This is to prevent loss of granular
information when during an entry txn, the good (src) brick goes
offline before the post-op is done. This would cause self-heal to
do conservative merge (since dirty xattr is the only information
available), which when granular-entry-heal is enabled, expects
granular indices, the lack of which can lead to loss of data in
the worst case.

Change-Id: Ia3ad716d6fb1821555f02180e86e8711a79f958d
BUG: 1402730
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16075
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>syncop: fix conditional wait bug in parallel dir scan</title>
<updated>2016-12-09T10:24:21+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2016-12-09T04:20:43+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=2d012c4558046afd6adb3992ff88f937c5f835e4'/>
<id>2d012c4558046afd6adb3992ff88f937c5f835e4</id>
<content type='text'>
Problem:
The issue as seen by the user is detailed in the BZ but what is
happening is if the no. of items in the wait queue == max-qlen,
syncop_mt_dir_scan() does a pthread_cond_wait until the launched
synctask workers dequeue the queue. But if for some reason the worker
fails, the queue is never emptied due to which further invocations of
syncop_mt_dir_scan() are blocked forever.

Fix: Made some changes to _dir_scan_job_fn

- If a worker encounters error while processing an entry, notify the
  readdir loop in syncop_mt_dir_scan() of the error but continue to process
  other entries in the queue, decrementing the qlen as and when we dequeue
  elements, and ending only when the queue is empty.

- If the readdir loop in syncop_mt_dir_scan() gets an error form the
  worker, stop the readdir+queueing of further entries.

Change-Id: I39ce073e01a68c7ff18a0e9227389245a6f75b88
BUG: 1402841
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16073
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
The issue as seen by the user is detailed in the BZ but what is
happening is if the no. of items in the wait queue == max-qlen,
syncop_mt_dir_scan() does a pthread_cond_wait until the launched
synctask workers dequeue the queue. But if for some reason the worker
fails, the queue is never emptied due to which further invocations of
syncop_mt_dir_scan() are blocked forever.

Fix: Made some changes to _dir_scan_job_fn

- If a worker encounters error while processing an entry, notify the
  readdir loop in syncop_mt_dir_scan() of the error but continue to process
  other entries in the queue, decrementing the qlen as and when we dequeue
  elements, and ending only when the queue is empty.

- If the readdir loop in syncop_mt_dir_scan() gets an error form the
  worker, stop the readdir+queueing of further entries.

Change-Id: I39ce073e01a68c7ff18a0e9227389245a6f75b88
BUG: 1402841
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16073
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>uss: snapd should enable SSL if SSL is enabled on volume</title>
<updated>2016-12-01T09:27:07+00:00</updated>
<author>
<name>Rajesh Joseph</name>
<email>rjoseph@redhat.com</email>
</author>
<published>2016-11-29T16:27:37+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=182f0d12040dab5081ca645a3f370f65cd68b528'/>
<id>182f0d12040dab5081ca645a3f370f65cd68b528</id>
<content type='text'>
During snapd graph generation we should check if SSL is
enabled on main volume or not. This is because clients
will communicate with snapd as if it is communicating to
a brick.

Change-Id: I0d7fe86c567b297a8528a48faf06161d4c3cb415
Signed-off-by: Rajesh Joseph &lt;rjoseph@redhat.com&gt;
BUG: 1400013
Reviewed-on: http://review.gluster.org/15979
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Kaushal M &lt;kaushal@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
During snapd graph generation we should check if SSL is
enabled on main volume or not. This is because clients
will communicate with snapd as if it is communicating to
a brick.

Change-Id: I0d7fe86c567b297a8528a48faf06161d4c3cb415
Signed-off-by: Rajesh Joseph &lt;rjoseph@redhat.com&gt;
BUG: 1400013
Reviewed-on: http://review.gluster.org/15979
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Kaushal M &lt;kaushal@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libglusterfs: Fix a read hang</title>
<updated>2016-11-29T03:59:08+00:00</updated>
<author>
<name>Poornima G</name>
<email>pgurusid@redhat.com</email>
</author>
<published>2016-11-21T14:27:08+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=8943c19a2ef51b6e4fa66cb57211d469fe558579'/>
<id>8943c19a2ef51b6e4fa66cb57211d469fe558579</id>
<content type='text'>
Issue:
=====
In certain cases, there was no unwind of read
from read-ahead xlator, thus resulting in hang.

RCA:
====
In certain cases, ioc_readv() issues STACK_WIND_TAIL() instead
of STACK_WIND(). One such case is when inode_ctx for that file
is not present (can happen if readdirp was called, and populates
md-cache and serves all the lookups from cache).

Consider the following graph:
...
io-cache (parent)
   |
readdir-ahead
   |
read-ahead
...

Below is the code snippet of ioc_readv calling STACK_WIND_TAIL:
ioc_readv()
{
...
 if (!inode_ctx)
   STACK_WIND_TAIL (frame, FIRST_CHILD (frame-&gt;this),
                    FIRST_CHILD (frame-&gt;this)-&gt;fops-&gt;readv, fd,
                    size, offset, flags, xdata);
   /* Ideally, this stack_wind should wind to readdir-ahead:readv()
      but it winds to read-ahead:readv(). See below for
      explaination.
    */
...
}

STACK_WIND_TAIL (frame, obj, fn, ...)
{
  frame-&gt;this = obj;
  /* for the above mentioned graph, frame-&gt;this will be readdir-ahead
   * frame-&gt;this = FIRST_CHILD (frame-&gt;this) i.e. readdir-ahead, which
   * is as expected
   */
  ...
  THIS = obj;
  /* THIS will be read-ahead instead of readdir-ahead!, as obj expands
   * to "FIRST_CHILD (frame-&gt;this)" and frame-&gt;this was pointing
   * to readdir-ahead in the previous statement.
   */
  ...
  fn (frame, obj, params);
  /* fn will call read-ahead:readv() instead of readdir-ahead:readv()!
   * as fn expands to "FIRST_CHILD (frame-&gt;this)-&gt;fops-&gt;readv" and
   * frame-&gt;this was pointing ro readdir-ahead in the first statement
   */
  ...
}

Thus, the readdir-ahead's readv() implementation will be skipped, and
ra_readv() will be called with frame-&gt;this = "readdir-ahead" and
this = "read-ahead". This can lead to corruption / hang / other problems.
But in this perticular case, when 'frame-&gt;this' and 'this' passed
to ra_readv() doesn't match, it causes ra_readv() to call ra_readv()
again!. Thus the logic of read-ahead readv() falls apart and leads to
hang.

Solution:
=========
Modify STACK_WIND_TAIL() as:
STACK_WIND_TAIL (frame, obj, fn, ...)
{
  next_xl = obj /* resolve obj as the variables passed in obj macro
                   can be overwritten in the further instrucions */
  next_xl_fn = fn /* resolve fn and store in a tmp variable, before
                     modifying any variables */
  frame-&gt;this = next_xl;
  ...
  THIS = next_xl;
  ...
  next_xl_fn (frame, next_xl, params);
  ...
}

As a part of http://review.gluster.org/15901/ the caller io-cache
was fixed.

BUG: 1388292
Change-Id: Ie662ac8f18fa16909376f1e59387bc5b886bd0f9
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15923
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Rajesh Joseph &lt;rjoseph@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Issue:
=====
In certain cases, there was no unwind of read
from read-ahead xlator, thus resulting in hang.

RCA:
====
In certain cases, ioc_readv() issues STACK_WIND_TAIL() instead
of STACK_WIND(). One such case is when inode_ctx for that file
is not present (can happen if readdirp was called, and populates
md-cache and serves all the lookups from cache).

Consider the following graph:
...
io-cache (parent)
   |
readdir-ahead
   |
read-ahead
...

Below is the code snippet of ioc_readv calling STACK_WIND_TAIL:
ioc_readv()
{
...
 if (!inode_ctx)
   STACK_WIND_TAIL (frame, FIRST_CHILD (frame-&gt;this),
                    FIRST_CHILD (frame-&gt;this)-&gt;fops-&gt;readv, fd,
                    size, offset, flags, xdata);
   /* Ideally, this stack_wind should wind to readdir-ahead:readv()
      but it winds to read-ahead:readv(). See below for
      explaination.
    */
...
}

STACK_WIND_TAIL (frame, obj, fn, ...)
{
  frame-&gt;this = obj;
  /* for the above mentioned graph, frame-&gt;this will be readdir-ahead
   * frame-&gt;this = FIRST_CHILD (frame-&gt;this) i.e. readdir-ahead, which
   * is as expected
   */
  ...
  THIS = obj;
  /* THIS will be read-ahead instead of readdir-ahead!, as obj expands
   * to "FIRST_CHILD (frame-&gt;this)" and frame-&gt;this was pointing
   * to readdir-ahead in the previous statement.
   */
  ...
  fn (frame, obj, params);
  /* fn will call read-ahead:readv() instead of readdir-ahead:readv()!
   * as fn expands to "FIRST_CHILD (frame-&gt;this)-&gt;fops-&gt;readv" and
   * frame-&gt;this was pointing ro readdir-ahead in the first statement
   */
  ...
}

Thus, the readdir-ahead's readv() implementation will be skipped, and
ra_readv() will be called with frame-&gt;this = "readdir-ahead" and
this = "read-ahead". This can lead to corruption / hang / other problems.
But in this perticular case, when 'frame-&gt;this' and 'this' passed
to ra_readv() doesn't match, it causes ra_readv() to call ra_readv()
again!. Thus the logic of read-ahead readv() falls apart and leads to
hang.

Solution:
=========
Modify STACK_WIND_TAIL() as:
STACK_WIND_TAIL (frame, obj, fn, ...)
{
  next_xl = obj /* resolve obj as the variables passed in obj macro
                   can be overwritten in the further instrucions */
  next_xl_fn = fn /* resolve fn and store in a tmp variable, before
                     modifying any variables */
  frame-&gt;this = next_xl;
  ...
  THIS = next_xl;
  ...
  next_xl_fn (frame, next_xl, params);
  ...
}

As a part of http://review.gluster.org/15901/ the caller io-cache
was fixed.

BUG: 1388292
Change-Id: Ie662ac8f18fa16909376f1e59387bc5b886bd0f9
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15923
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Rajesh Joseph &lt;rjoseph@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: CLI for granular entry heal enablement/disablement</title>
<updated>2016-11-28T11:56:33+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-09-22T11:18:54+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=6dfc90fcd36956dcc4f624b3912bfb8e9c95757f'/>
<id>6dfc90fcd36956dcc4f624b3912bfb8e9c95757f</id>
<content type='text'>
When there are already existing non-granular indices created that are
yet to be healed, if granular-entry-heal option is toggled from 'off' to
'on', AFR self-heal whenever it kicks in, will try to look for granular
indices in 'entry-changes'. Because of the absence of name indices,
granular entry healing logic will fail to heal these directories, and
worse yet unset pending extended attributes with the assumption that
are no entries that need heal.

To get around this, a new CLI is introduced which will invoke glfsheal
program to figure whether at the time an attempt is made to enable
granular entry heal, there are pending heals on the volume OR there
are one or more bricks that are down. If either of them is true, the
command will be failed with the appropriate error.

New CLI: gluster volume heal &lt;VOL&gt; granular-entry-heal {enable,disable}

Change-Id: I1f4fe8162813b9068e198965d94169fee4adc099
BUG: 1370410
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15747
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When there are already existing non-granular indices created that are
yet to be healed, if granular-entry-heal option is toggled from 'off' to
'on', AFR self-heal whenever it kicks in, will try to look for granular
indices in 'entry-changes'. Because of the absence of name indices,
granular entry healing logic will fail to heal these directories, and
worse yet unset pending extended attributes with the assumption that
are no entries that need heal.

To get around this, a new CLI is introduced which will invoke glfsheal
program to figure whether at the time an attempt is made to enable
granular entry heal, there are pending heals on the volume OR there
are one or more bricks that are down. If either of them is true, the
command will be failed with the appropriate error.

New CLI: gluster volume heal &lt;VOL&gt; granular-entry-heal {enable,disable}

Change-Id: I1f4fe8162813b9068e198965d94169fee4adc099
BUG: 1370410
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15747
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
