author    | Atin Mukherjee <amukherj@redhat.com>      | 2014-10-27 12:12:03 +0530
committer | Raghavendra Bhat <raghavendra@redhat.com> | 2015-03-03 23:31:08 -0800
commit    | b646678334f4fab78883ecc1b993ec0cb1b49aba (patch)
tree      | 206f29f1c5372732e7e2ed2116f8f35cc9c2c19b /cli
parent    | a1d9f01b28267fc333aebc49cb81ee69dc2c24f8 (diff)
glusterd: release cluster-wide locks in op-sm during failures
The glusterd op-sm infrastructure has loopholes in handling error cases during the
locking/unlocking phases, which can leave stale locks behind and block further
transactions from going through.
This patch still does not handle every possible unlocking error case, as the
framework has neither a retry mechanism nor a lock timeout. For example, if
unlocking fails on one of the peers, the cluster-wide lock is not released and no
further transaction can be made until the originator node, or the node where
unlocking failed, is restarted.
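As a rough illustration of the pattern the commit message describes, the sketch below
shows the idea that every failure path in a lock-phase callback should still drive the
state machine toward an unlock/cleanup event rather than returning early. All names here
(sm_inject, cluster_lock_cbk, the event enum) are hypothetical stand-ins, not glusterd's
actual op-sm API.

/* Minimal sketch, assuming hypothetical names; not glusterd's real code. */
#include <stdio.h>

enum sm_event { EV_LOCK_ACQUIRED, EV_LOCK_FAILED };

/* Stand-in for injecting an event into the op state machine; in a real
 * daemon the FAILED event is what eventually releases locks already
 * acquired on other peers. */
static int
sm_inject (enum sm_event ev)
{
        printf ("injected event: %s\n",
                ev == EV_LOCK_ACQUIRED ? "LOCK_ACQUIRED" : "LOCK_FAILED");
        return 0;
}

/* Lock RPC callback: timeouts, decode errors and responses from unknown
 * peers all take the EV_LOCK_FAILED path instead of returning early, so
 * no cluster-wide lock is left stale. */
static int
cluster_lock_cbk (int rpc_status, int decode_ok, int peer_known)
{
        if (rpc_status != 0 || !decode_ok || !peer_known)
                return sm_inject (EV_LOCK_FAILED);

        return sm_inject (EV_LOCK_ACQUIRED);
}

int
main (void)
{
        cluster_lock_cbk (-1, 1, 1);   /* simulated RPC timeout    */
        cluster_lock_cbk (0, 0, 1);    /* simulated decode failure */
        cluster_lock_cbk (0, 1, 1);    /* healthy response         */
        return 0;
}

The caveat stated above still applies: if the unlock RPC itself fails on a peer, nothing
in this pattern retries or times out the lock.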
The following test cases were executed (with the help of gdb) after applying this
patch; a hypothetical gdb session illustrating how such failures can be forced is
shown after the list:
* RPC times out in the lock cbk
* Decoding of the RPC response fails in the lock cbk
* RPC response is received from an unknown peer in the lock cbk
* Setting peerinfo in the dictionary fails while sending the lock request for the
  first peer in the list
* Setting peerinfo in the dictionary fails while sending the lock request for the
  other peers
* The lock RPC could not be sent to the peers
For all of the above test cases, the success criterion was that no stale locks were
left behind.
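For illustration, this kind of fault injection can be done by attaching gdb to glusterd
and forcing an error inside the relevant callback. The session below is hypothetical:
the breakpoint symbol and the variable being overwritten are placeholders, not
necessarily the real glusterd names.

$ gdb -p $(pidof glusterd)
(gdb) break glusterd_cluster_lock_cbk
(gdb) continue
  ... in another shell: gluster volume rebalance <volname> start ...
Breakpoint 1, glusterd_cluster_lock_cbk (...)
(gdb) set variable op_ret = -1
(gdb) continue

The result can then be checked by confirming that a subsequent volume operation is not
rejected because another transaction supposedly holds the cluster lock.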
Patch link : http://review.gluster.org/9012
Change-Id: Ia1550341c31005c7850ee1b2697161c9ca04b01a
BUG: 1179136
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: http://review.gluster.org/9012
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/9393
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Diffstat (limited to 'cli')
-rw-r--r-- | cli/src/cli-rpc-ops.c | 20
1 file changed, 15 insertions(+), 5 deletions(-)
diff --git a/cli/src/cli-rpc-ops.c b/cli/src/cli-rpc-ops.c
index f194a76efb4..1af1ca52af1 100644
--- a/cli/src/cli-rpc-ops.c
+++ b/cli/src/cli-rpc-ops.c
@@ -1508,14 +1508,18 @@ gf_cli_defrag_volume_cbk (struct rpc_req *req, struct iovec *iov,
                 if (rsp.op_ret && strcmp (rsp.op_errstr, "")) {
                         snprintf (msg, sizeof (msg), "%s", rsp.op_errstr);
                 } else {
-                        if (!rsp.op_ret) {
+                        if (!rsp.op_ret) {
+                                /* append errstr in the cli msg for successful
+                                 * case since unlock failures can be highlighted
+                                 * event though rebalance command was successful
+                                 */
                                 snprintf (msg, sizeof (msg),
                                           "Initiated rebalance on volume %s."
                                           "\nExecute \"gluster volume rebalance"
                                           " <volume-name> status\" to check"
-                                          " status.\nID: %s", volname,
-                                          task_id_str);
-                        } else {
+                                          " status.\nID: %s\n%s", volname,
+                                          task_id_str, rsp.op_errstr);
+                        } else {
                                 snprintf (msg, sizeof (msg),
                                           "Starting rebalance on volume %s has "
                                           "been unsuccessful.", volname);
@@ -1535,13 +1539,17 @@ gf_cli_defrag_volume_cbk (struct rpc_req *req, struct iovec *iov,
                                           volname);
                         goto done;
                 } else {
+                        /* append errstr in the cli msg for successful case
+                         * since unlock failures can be highlighted event though
+                         * rebalance command was successful */
                         snprintf (msg, sizeof (msg),
                                   "rebalance process may be in the middle of a "
                                   "file migration.\nThe process will be fully "
                                   "stopped once the migration of the file is "
                                   "complete.\nPlease check rebalance process "
                                   "for completion before doing any further "
-                                  "brick related tasks on the volume.");
+                                  "brick related tasks on the volume.\n%s",
+                                  rsp.op_errstr);
                 }
         }
         if (cmd == GF_DEFRAG_CMD_STATUS) {
@@ -1554,6 +1562,8 @@ gf_cli_defrag_volume_cbk (struct rpc_req *req, struct iovec *iov,
                                           "Failed to get the status of "
                                           "rebalance process");
                         goto done;
+                } else {
+                        snprintf (msg, sizeof (msg), "%s", rsp.op_errstr);
                 }
         }
 
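For illustration only, with this change the message printed for a successful rebalance
start whose unlock phase hit a problem could look roughly like the output below. The
shape comes from the format string in the hunk above; the volume name, task ID and
error line are placeholders, not captured output.

Initiated rebalance on volume <volname>.
Execute "gluster volume rebalance <volume-name> status" to check status.
ID: <task-id>
<contents of rsp.op_errstr, e.g. a locking/unlocking failure message>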