From e91bfdec983ce41a395fe0b89d2a31fb5a6564a3 Mon Sep 17 00:00:00 2001
From: Samikshan Bairagya
Date: Tue, 23 May 2017 19:32:24 +0530
Subject: glusterd: Eliminate race in brick compatibility checking stage

In https://review.gluster.org/17307/, while looking for compatible
bricks for multiplexing, it is checked if the brick pidfile exists
before checking if the corresponding brick process is running.
However checking if the brick process is running just after
checking if the pidfile exists isn't enough since there might be
race conditions where the pidfile has been created but hasn't
been updated with a pid value yet. This commit solves that by
making sure that we wait iteratively till the pid value is updated
as well.

> Reviewed-on: https://review.gluster.org/17375
> Smoke: Gluster Build System
> Reviewed-by: Atin Mukherjee
> NetBSD-regression: NetBSD Build System
> CentOS-regression: Gluster Build System

(cherry picked from commit a8624b8b13a1f4222e4d3e33fa5836d7b45369bc)

Change-Id: Ib7a158f95566486f7c1f84b6357c9b89e4c797ae
BUG: 1453087
Signed-off-by: Samikshan Bairagya
Reviewed-on: https://review.gluster.org/17425
NetBSD-regression: NetBSD Build System
Tested-by: Raghavendra Talur
CentOS-regression: Gluster Build System
Smoke: Gluster Build System
Reviewed-by: Raghavendra Talur
---
 xlators/mgmt/glusterd/src/glusterd-utils.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/xlators/mgmt/glusterd/src/glusterd-utils.c b/xlators/mgmt/glusterd/src/glusterd-utils.c
index b86a8440458..720e8955b68 100644
--- a/xlators/mgmt/glusterd/src/glusterd-utils.c
+++ b/xlators/mgmt/glusterd/src/glusterd-utils.c
@@ -5227,13 +5227,16 @@ find_compat_brick_in_vol (glusterd_conf_t *conf,
                  * wait for the pidfile to be populated with a value before
                  * checking if the service is running */
                 while (retries > 0) {
-                        if (sys_access (pidfile2, F_OK) == 0)
+                        if (sys_access (pidfile2, F_OK) == 0 &&
+                            gf_is_service_running (pidfile2, &pid2)) {
                                 break;
+                        }
+                        sleep (1);
                         retries--;
                 }
 
-                if (!gf_is_service_running (pidfile2, &pid2)) {
+                if (retries == 0) {
                         gf_log (this->name, GF_LOG_INFO,
                                 "cleaning up dead brick %s:%s",
                                 other_brick->hostname, other_brick->path);
--
cgit
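
The retry pattern described in the commit message can be sketched outside glusterd as a small standalone C unit. This is only an illustration under stated assumptions, not the glusterd implementation: `read_pid_from_file`, `is_process_running` and `wait_for_live_pid` are hypothetical names, and the `kill (pid, 0)` liveness probe is a stand-in chosen for the sketch; it makes no claim about how `gf_is_service_running` works internally. The point it shows is that a pidfile that exists but is still empty counts as "not ready yet" and is retried, and only exhausting every retry is treated as a dead process.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 and stores the pid if the file exists and
 * already holds a positive integer; returns 0 if the file is missing or has
 * been created but not populated yet. */
static int
read_pid_from_file (const char *path, pid_t *pid_out)
{
        FILE *fp = fopen (path, "r");
        long  value = 0;
        int   ok = 0;

        if (fp == NULL)
                return 0;
        if (fscanf (fp, "%ld", &value) == 1 && value > 0) {
                *pid_out = (pid_t) value;
                ok = 1;
        }
        fclose (fp);
        return ok;
}

/* Hypothetical helper: signal 0 delivers nothing and only performs the
 * existence and permission checks. EPERM means the process exists but this
 * caller is not allowed to signal it. */
static int
is_process_running (pid_t pid)
{
        return kill (pid, 0) == 0 || errno == EPERM;
}

/* Retry until the pidfile is populated AND the process is alive. Returns 1
 * on success, 0 once all retries are used up (the patch's "retries == 0"
 * case). As in the patch, the sleep also runs after the last failed try. */
int
wait_for_live_pid (const char *path, int retries,
                   unsigned int delay_seconds, pid_t *pid_out)
{
        assert (path != NULL && pid_out != NULL);

        while (retries > 0) {
                if (read_pid_from_file (path, pid_out) &&
                    is_process_running (*pid_out))
                        return 1;
                sleep (delay_seconds);
                retries--;
        }
        return 0;
}
```

A caller would use something like `wait_for_live_pid (pidfile, 5, 1, &pid)` and run its cleanup path only when the function returns 0; the retry count and delay here are arbitrary illustration values, not the ones glusterd uses.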