author | Ravishankar N <ravishankar@redhat.com> | 2016-12-09 09:50:43 +0530 |
---|---|---|
committer | Pranith Kumar Karampuri <pkarampu@redhat.com> | 2016-12-09 02:24:21 -0800 |
commit | 2d012c4558046afd6adb3992ff88f937c5f835e4 (patch) | |
tree | e41cf9a6eeca0d299296472d6d2bc331f3960e00 /tests | |
parent | 64451d0f25e7cc7aafc1b6589122648281e4310a (diff) |
syncop: fix conditional wait bug in parallel dir scan
Problem:
The issue as seen by the user is detailed in the BZ. What happens is that
when the number of items in the wait queue equals max-qlen,
syncop_mt_dir_scan() does a pthread_cond_wait until the launched
synctask workers dequeue entries from the queue. But if a worker fails
for some reason, the queue is never emptied, so further invocations of
syncop_mt_dir_scan() are blocked forever.
Fix: Made some changes to _dir_scan_job_fn
- If a worker encounters an error while processing an entry, notify the
readdir loop in syncop_mt_dir_scan() of the error but continue to process
other entries in the queue, decrementing the qlen as and when we dequeue
elements, and ending only when the queue is empty.
- If the readdir loop in syncop_mt_dir_scan() gets an error from the
worker, stop the readdir+queueing of further entries.
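The two points above can be illustrated with a minimal, standalone sketch. This is not the GlusterFS code: struct scan, worker(), scan_entries() and MAX_QLEN are hypothetical names, a plain pthread stands in for the synctask worker, and integers stand in for directory entries. It only shows the pattern: the worker records an error but keeps draining the queue, and the producer loop stops queueing once it sees the error.

```c
#include <pthread.h>
#include <stdbool.h>

#define MAX_QLEN 4 /* hypothetical stand-in for the max-qlen limit */

struct scan {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int queue[MAX_QLEN]; /* ring buffer of pending entries */
    int head;
    int qlen;
    bool done;     /* producer has stopped queueing */
    int err;       /* first error reported by the worker, 0 if none */
    int fail_on;   /* entry whose processing fails, -1 for none */
    int processed; /* entries processed successfully */
};

/* Worker: on error, report it but keep dequeuing until the queue is empty. */
static void *worker(void *arg)
{
    struct scan *s = arg;

    pthread_mutex_lock(&s->lock);
    for (;;) {
        while (s->qlen == 0 && !s->done)
            pthread_cond_wait(&s->cond, &s->lock);
        if (s->qlen == 0)
            break; /* producer is done and the queue is drained */

        int entry = s->queue[s->head];
        s->head = (s->head + 1) % MAX_QLEN;
        s->qlen--; /* decrement as and when we dequeue */
        pthread_cond_broadcast(&s->cond); /* wake a producer waiting for room */
        pthread_mutex_unlock(&s->lock);

        int ret = (entry == s->fail_on) ? -1 : 0; /* simulated processing */

        pthread_mutex_lock(&s->lock);
        if (ret == 0) {
            s->processed++;
        } else if (s->err == 0) {
            s->err = ret; /* notify the producer, but do not exit early */
            pthread_cond_broadcast(&s->cond);
        }
    }
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Producer: queue entries 0..nentries-1, stopping once the worker reports an error. */
int scan_entries(int nentries, int fail_on, int *processed, int *leftover)
{
    struct scan s = { .fail_on = fail_on };
    pthread_t tid;

    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.cond, NULL);
    if (pthread_create(&tid, NULL, worker, &s) != 0)
        return -1;

    pthread_mutex_lock(&s.lock);
    for (int i = 0; i < nentries; i++) {
        /* Wait for room, but also wake up if the worker has failed. */
        while (s.qlen == MAX_QLEN && s.err == 0)
            pthread_cond_wait(&s.cond, &s.lock);
        if (s.err != 0)
            break; /* stop queueing further entries */
        s.queue[(s.head + s.qlen) % MAX_QLEN] = i;
        s.qlen++;
        pthread_cond_broadcast(&s.cond);
    }
    s.done = true;
    pthread_cond_broadcast(&s.cond);
    pthread_mutex_unlock(&s.lock);

    pthread_join(tid, NULL);
    *processed = s.processed;
    *leftover = s.qlen; /* nothing is left behind once the worker exits */
    pthread_cond_destroy(&s.cond);
    pthread_mutex_destroy(&s.lock);
    return s.err;
}
```

Calling scan_entries(200, 5, &processed, &leftover) in this sketch returns -1 with leftover equal to 0. If the worker instead returned as soon as entry 5 failed, the queue could stay full and the producer would wait on the condition variable indefinitely.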
Change-Id: I39ce073e01a68c7ff18a0e9227389245a6f75b88
BUG: 1402841
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/16073
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Diffstat (limited to 'tests')
-rwxr-xr-x | tests/bugs/core/bug-1402841.t-mt-dir-scan-race.t | 31 |
1 files changed, 31 insertions, 0 deletions
diff --git a/tests/bugs/core/bug-1402841.t-mt-dir-scan-race.t b/tests/bugs/core/bug-1402841.t-mt-dir-scan-race.t
new file mode 100755
index 00000000000..e31c81005bf
--- /dev/null
+++ b/tests/bugs/core/bug-1402841.t-mt-dir-scan-race.t
@@ -0,0 +1,31 @@
+#!/bin/bash
+. $(dirname $0)/../../include.rc
+. $(dirname $0)/../../volume.rc
+cleanup;
+
+TEST glusterd
+TEST pidof glusterd
+TEST $CLI volume create $V0 replica 2 $H0:$B0/${V0}{0,1}
+TEST $CLI volume set $V0 self-heal-daemon off
+TEST $CLI volume set $V0 cluster.shd-wait-qlength 100
+TEST $CLI volume start $V0
+
+TEST glusterfs --volfile-id=$V0 --volfile-server=$H0 $M0;
+touch $M0/file{1..200}
+
+TEST kill_brick $V0 $H0 $B0/${V0}1
+for i in {1..200}; do echo hello>$M0/file$i; done
+TEST $CLI volume start $V0 force
+EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status $V0 $H0 $B0/${V0}1
+EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" afr_child_up_status $V0 1
+
+EXPECT "200" get_pending_heal_count $V0
+TEST $CLI volume set $V0 self-heal-daemon on
+TEST $CLI volume heal $V0
+TEST $CLI volume set $V0 self-heal-daemon off
+EXPECT_NOT "^0$" get_pending_heal_count $V0
+TEST $CLI volume set $V0 self-heal-daemon on
+TEST $CLI volume heal $V0
+EXPECT_WITHIN $HEAL_TIMEOUT "^0$" get_pending_heal_count $V0
+TEST umount $M0
+cleanup;