From da9deb54df91dedc51ebe165f3a0be646455cb5b Mon Sep 17 00:00:00 2001
From: Atin Mukherjee
Date: Fri, 12 Dec 2014 07:21:19 +0530
Subject: glusterd: Maintain per transaction xaction_peers list in syncop &
 mgmt_v3

In the current implementation the xaction_peers list is maintained in a
global variable (glusterd_priv_t) for syncop/mgmt_v3. This means the
consistency and atomicity of the peerinfo list across transactions is
not guaranteed when multiple syncop/mgmt_v3 transactions are in flight.

We ran into this with mgmt_v3-locks.t, which was failing spuriously: two
volume set operations (on two different volumes) were going through
simultaneously, and both transactions were manipulating the same
xaction_peers structure, which led to a corrupted list. As a result, in
some cases the unlock request to a peer was never triggered and we ended
up with stale locks.

The solution is to maintain a per-transaction local xaction_peers list
for every syncop.

Please note that I've identified this problem in the op-sm area as well;
a separate patch will be attempted to fix it.

Finally, thanks to Krishnan Parthasarathi and Kaushal M for your
constant help in getting to the root cause.

Change-Id: Ib1eaac9e5c8fc319f4e7f8d2ad965bc1357a7c63
BUG: 1173414
Signed-off-by: Atin Mukherjee
Reviewed-on: http://review.gluster.org/9269
Tested-by: Gluster Build System
Reviewed-by: Kaushal M
---
 .../bugs/bug-1173414-mgmt-v3-remote-lock-failure.t | 34 ++++++++++++++++++++++
 1 file changed, 34 insertions(+)
 create mode 100755 tests/bugs/bug-1173414-mgmt-v3-remote-lock-failure.t

diff --git a/tests/bugs/bug-1173414-mgmt-v3-remote-lock-failure.t b/tests/bugs/bug-1173414-mgmt-v3-remote-lock-failure.t
new file mode 100755
index 00000000000..adc3fe30dd4
--- /dev/null
+++ b/tests/bugs/bug-1173414-mgmt-v3-remote-lock-failure.t
@@ -0,0 +1,34 @@
+#!/bin/bash
+
+. $(dirname $0)/../include.rc
+. $(dirname $0)/../cluster.rc
+
+function check_peers {
+    $CLI_1 peer status | grep 'Peer in Cluster (Connected)' | wc -l
+}
+
+cleanup;
+
+TEST launch_cluster 2;
+TEST $CLI_1 peer probe $H2;
+
+EXPECT_WITHIN $PROBE_TIMEOUT 1 check_peers
+
+TEST $CLI_1 volume create $V0 $H1:$B1/$V0
+TEST $CLI_1 volume create $V1 $H1:$B1/$V1
+TEST $CLI_1 volume start $V0
+TEST $CLI_1 volume start $V1
+
+for i in {1..20}
+do
+    $CLI_1 volume set $V0 diagnostics.client-log-level DEBUG &
+    $CLI_1 volume set $V1 barrier on
+    $CLI_2 volume set $V0 diagnostics.client-log-level DEBUG &
+    $CLI_2 volume set $V1 barrier on
+done
+
+EXPECT_WITHIN $PROBE_TIMEOUT 1 check_peers
+TEST $CLI_1 volume status
+TEST $CLI_2 volume status
+
+cleanup;
--
cgit
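
For illustration, a minimal standalone C sketch of the approach the
commit describes: each transaction snapshots the connected peers into a
list it owns (built under the peer lock), instead of mutating one global
xaction_peers list, so two concurrent volume-set transactions can no
longer corrupt each other's peer membership. All names here (peer_t,
xpeer_t, build_xaction_peers, txn) are invented for this sketch and are
not the actual glusterd symbols.

/* Sketch only -- not the actual glusterd code. Mimics the fix: every
 * transaction builds its own xaction_peers list on its stack instead of
 * sharing a single global list across concurrent transactions. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct peer {
        char         *name;
        int           connected;
        struct peer  *next;      /* link in the global peer list */
} peer_t;

typedef struct xpeer {
        peer_t       *peer;      /* the shared peer this entry refers to */
        struct xpeer *next;      /* link in the per-transaction list */
} xpeer_t;

static peer_t         *peers;    /* global peer list (shared) */
static pthread_mutex_t peer_lock = PTHREAD_MUTEX_INITIALIZER;

/* Snapshot the connected peers into a list owned by the caller. Because
 * the list head lives on the transaction's stack, concurrent
 * transactions can never corrupt each other's membership. */
static xpeer_t *
build_xaction_peers (void)
{
        xpeer_t *head = NULL;

        pthread_mutex_lock (&peer_lock);
        for (peer_t *p = peers; p; p = p->next) {
                if (!p->connected)
                        continue;
                xpeer_t *x = calloc (1, sizeof (*x));
                if (!x)
                        continue;
                x->peer = p;
                x->next = head;
                head    = x;
        }
        pthread_mutex_unlock (&peer_lock);
        return head;
}

static void
free_xaction_peers (xpeer_t *head)
{
        while (head) {
                xpeer_t *next = head->next;
                free (head);
                head = next;
        }
}

/* One volume-set style transaction: every phase walks the same private
 * snapshot, so the unlock phase always reaches every peer that was
 * locked -- the invariant the stale-lock bug violated. */
static void *
txn (void *arg)
{
        xpeer_t *xaction_peers = build_xaction_peers ();

        for (xpeer_t *x = xaction_peers; x; x = x->next)
                printf ("%s: lock+commit+unlock on %s\n",
                        (char *)arg, x->peer->name);

        free_xaction_peers (xaction_peers);
        return NULL;
}

int
main (void)
{
        static peer_t p2 = { "peer2", 1, NULL };
        static peer_t p1 = { "peer1", 1, &p2 };
        peers = &p1;

        pthread_t a, b;
        pthread_create (&a, NULL, txn, "txn-A");
        pthread_create (&b, NULL, txn, "txn-B");
        pthread_join (a, NULL);
        pthread_join (b, NULL);
        return 0;
}

Built with cc -pthread, both threads print a complete pass over every
connected peer regardless of interleaving, which is what a shared global
xaction_peers list failed to guarantee.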