glusterd: Aggregate tasks status in 'volume status [tasks]'

Previously, glusterd sent back only the local status of a task in response
to a 'volume status [tasks]' command. As the rebalance operation is
distributed and asynchronous, this meant that different peers could give
different status values for a rebalance or remove-brick task.

With this patch, all the peers send back the tasks status as a part of the
'volume status' commit op, and the origin peer aggregates these to arrive
at a final status for the task.

The aggregation is only done for rebalance or remove-brick tasks. The
replace-brick task will have the same status on all the peers (see comment
in glusterd_volume_status_aggregate_tasks_status() for more information)
and need not be aggregated.

The rebalance process has 5 states:
 NOT_STARTED - rebalance process has not been started on this node
 STARTED     - rebalance process has been started and is still running
 STOPPED     - rebalance process was stopped by a 'rebalance/remove-brick
               stop' command
 COMPLETED   - rebalance process completed successfully
 FAILED      - rebalance process failed to complete successfully

The aggregation is done using the following precedence:
 STARTED > FAILED > STOPPED > COMPLETED > NOT_STARTED

The new changes make the 'volume status tasks' command a distributed
command, as we need to get the task status from all peers.

The following tests were performed:
- Start a remove-brick task and do a status command on a peer which
  doesn't have the brick being removed. The remove-brick status was given
  correctly as 'in progress' and 'completed', instead of 'not started'.
- Start a rebalance task, run the status command. The status moved to
  'completed' only after rebalance completed on all nodes.

Also, change the CLI xml output code for rebalance status to use the same
algorithm for status aggregation.
Change-Id: Ifd4aff705aa51609a612d5a9194acc73e10a82c0
BUG: 1027094
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/6230
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
author: Kaushal M <kaushal@redhat.com> 2013-10-30 18:25:39 +0530
committer: Anand Avati <avati@redhat.com> 2013-12-04 13:40:55 -0800
commit: b6c835282de500dff69e68bc4aebd3700c7388d0 (patch)
tree: c5c3ac90c35172d605e870487f1f1772838011eb /cli/src
parent: 866d079c5bfc9b278c654090a9c088fe2131db1d (diff)
1 files changed, 21 insertions, 4 deletions
diff --git a/cli/src/cli-xml-output.c b/cli/src/cli-xml-output.c
index 2927ab1e4fd..fe0969a3042 100644
--- a/cli/src/cli-xml-output.c
+++ b/cli/src/cli-xml-output.c
@@ -3248,13 +3248,30 @@ cli_xml_output_vol_rebalance_status (xmlTextWriterPtr writer, dict_t *dict,
                     overall_elapsed = elapsed;
                 }
 
+                /* Rebalance has 5 states,
+                 * NOT_STARTED, STARTED, STOPPED, COMPLETE, FAILED
+                 * The precedence used to determine the aggregate status is as
+                 * below,
+                 * STARTED > FAILED > STOPPED > COMPLETE > NOT_STARTED
+                 */
+                /* TODO: Move this to a common place utilities that both CLI and
+                 * glusterd need.
+                 * Till then if the below algorithm is changed, change it in
+                 * glusterd_volume_status_aggregate_tasks_status in
+                 * glusterd-utils.c
+                 */
+
                 if (-1 == overall_status)
                         overall_status = status_rcd;
-                else if ((GF_DEFRAG_STATUS_COMPLETE == overall_status ||
-                          status_rcd > overall_status) &&
-                         (status_rcd != GF_DEFRAG_STATUS_COMPLETE))
+                int rank[] = {
+                        [GF_DEFRAG_STATUS_STARTED] = 1,
+                        [GF_DEFRAG_STATUS_FAILED] = 2,
+                        [GF_DEFRAG_STATUS_STOPPED] = 3,
+                        [GF_DEFRAG_STATUS_COMPLETE] = 4,
+                        [GF_DEFRAG_STATUS_NOT_STARTED] = 5
+                };
+                if (rank[status_rcd] <= rank[overall_status])
                         overall_status = status_rcd;
-                XML_RET_CHECK_AND_GOTO (ret, out);
 
                 /* </node> */
                 ret = xmlTextWriterEndElement (writer);
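
The precedence rule described in the commit message can be sketched in isolation as below. This is only an illustration, not the glusterd or CLI code: the enum `task_status_t`, its numeric values, the `task_rank` table and the function `aggregate_task_status()` are hypothetical names made up for this example. It assumes that each node reports one status and that a lower rank means a higher precedence.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the rebalance status enum. The names and the
 * numeric values are illustrative only and are not taken from the
 * GlusterFS headers. */
typedef enum {
        TASK_NOT_STARTED = 0,
        TASK_STARTED,
        TASK_STOPPED,
        TASK_COMPLETE,
        TASK_FAILED,
        TASK_STATUS_MAX
} task_status_t;

/* Lower rank means higher precedence:
 * STARTED > FAILED > STOPPED > COMPLETE > NOT_STARTED */
static const int task_rank[TASK_STATUS_MAX] = {
        [TASK_STARTED]     = 1,
        [TASK_FAILED]      = 2,
        [TASK_STOPPED]     = 3,
        [TASK_COMPLETE]    = 4,
        [TASK_NOT_STARTED] = 5
};

/* Fold the per-node statuses into a single value.
 * Returns -1 if the list is empty or contains an out-of-range status. */
static int
aggregate_task_status (const int *statuses, size_t count)
{
        int    overall = -1;
        size_t i;

        assert (statuses != NULL || count == 0);

        for (i = 0; i < count; i++) {
                int s = statuses[i];

                if (s < 0 || s >= TASK_STATUS_MAX)
                        return -1;
                /* The first status seeds the result; after that, a status
                 * replaces it only if it has a strictly better rank. */
                if (overall == -1 || task_rank[s] < task_rank[overall])
                        overall = s;
        }
        return overall;
}
```

With this sketch, folding the statuses COMPLETE, STARTED and NOT_STARTED yields STARTED, which mirrors the tested behaviour in the commit message: a task is reported as completed only once no node is still running it.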