author | Sanju Rakonde <srakonde@redhat.com> | 2018-09-25 23:36:48 +0530 |
---|---|---|
committer | Atin Mukherjee <amukherj@redhat.com> | 2018-09-26 12:22:04 +0000 |
commit | f1e9b878ce2067db83a0baa5f384eda87287719d (patch) | |
tree | cf49400d80e735126bb5dac54fd395e12d4a661d /xlators | |
parent | 6b273c1644595472d17f08c891aab62cebecbcbe (diff) |
glusterd: make sure that brickinfo->uuid is not null
Problem: Upgrading from a version that does not have the
shared-brick-count option to a version that introduced it causes an
issue at the mount point: the size of the volume reported at the
mount point is reduced by a factor of the shared-brick-count value.
Cause: shared-brick-count is the number of bricks that share a file
system. gd_set_shared_brick_count() calculates the value from the
uuid of the node and the fsid of the brick.
https://review.gluster.org/#/c/glusterfs/+/19484 handles setting the
fsid properly during an upgrade, but that patch assumed that
brickinfo->uuid is non-null when its code path is reached. In this
case brickinfo->uuid was null for all the bricks, so the code path
that sets the fsid was never reached and the fsid stayed 0 for all
bricks. With every fsid equal to 0, the logic in
gd_set_shared_brick_count() did not work as expected and calculated
a wrong shared-brick-count.
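The effect of an unset fsid can be illustrated with a small model. This is only a sketch under assumptions: struct brick, UUID_LEN and shared_brick_count() are hypothetical, simplified names and are not the glusterd data structures or the body of gd_set_shared_brick_count().

```c
#include <assert.h>
#include <string.h>

#define UUID_LEN 16

/* Hypothetical, simplified stand-in for glusterd's brickinfo. */
struct brick {
    unsigned char uuid[UUID_LEN]; /* node that hosts the brick */
    unsigned long fsid;           /* backing file system id; 0 means "never set" */
};

/*
 * Count the bricks that sit on the same node and the same file system
 * as bricks[i]. A model of the idea only, not the glusterd code.
 */
static int
shared_brick_count(const struct brick *bricks, int n, int i)
{
    int count = 0;

    assert(i >= 0 && i < n);
    for (int j = 0; j < n; j++) {
        if (memcmp(bricks[j].uuid, bricks[i].uuid, UUID_LEN) == 0 &&
            bricks[j].fsid == bricks[i].fsid)
            count++;
    }
    return count;
}
```

In this model, three bricks on one node with fsids 11, 22 and 33 each get a count of 1. If all three fsids are left at 0, the bricks look as if they share one file system and each gets a count of 3.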
Solution: Before control reaches the code path added by
https://review.gluster.org/#/c/glusterfs/+/19484, check whether
brickinfo->uuid is null and, if it is, call
glusterd_resolve_brick() to set brickinfo->uuid to a proper
value. With a proper uuid, the fsid of the bricks is set
properly and the shared-brick-count value is calculated
correctly.
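The ordering that the fix relies on can be sketched as follows. All names here (struct brick, resolve_brick(), set_fsid_if_local(), retrieve_brick()) are hypothetical stand-ins for illustration. In particular, resolve_brick() simply treats the brick as local, which the real glusterd_resolve_brick() does not assume.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define UUID_LEN 16

/* Hypothetical, simplified stand-in for glusterd's brickinfo. */
struct brick {
    unsigned char uuid[UUID_LEN];
    unsigned long fsid;
};

static bool
uuid_is_null(const unsigned char *uuid)
{
    static const unsigned char zero[UUID_LEN];
    return memcmp(uuid, zero, UUID_LEN) == 0;
}

/* Stand-in for glusterd_resolve_brick(): assumes the brick is local. */
static void
resolve_brick(struct brick *b, const unsigned char *my_uuid)
{
    memcpy(b->uuid, my_uuid, UUID_LEN);
}

/* Stand-in for the fsid step: it only runs for bricks of this node. */
static void
set_fsid_if_local(struct brick *b, const unsigned char *my_uuid,
                  unsigned long fsid)
{
    if (memcmp(b->uuid, my_uuid, UUID_LEN) == 0)
        b->fsid = fsid;
}

/* with_fix selects whether the null-uuid check runs before the fsid step. */
static void
retrieve_brick(struct brick *b, const unsigned char *my_uuid,
               unsigned long fsid, bool with_fix)
{
    assert(b != NULL);
    if (with_fix && uuid_is_null(b->uuid))
        resolve_brick(b, my_uuid);
    set_fsid_if_local(b, my_uuid, fsid);
}
```

Without the check, a brick whose uuid is still null never matches the local uuid, so its fsid stays 0. With the check, the uuid is filled in first and the fsid step takes effect.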
Please take a look at the bug https://bugzilla.redhat.com/show_bug.cgi?id=1632889
for complete RCA
Steps followed to test the fix:
1. Created a 2 node cluster running a binary which doesn't have
the shared-brick-count option
2. Created a 2x(2+1) volume and started it
3. Mounted the volume, checked size of volume using df
4. Upgraded to a version where shared-brick-count is introduced
(upgraded the nodes one by one, i.e., stop the glusterd, upgrade the node
and start the glusterd).
5. After upgrading both the nodes, bumped up the cluster.op-version
6. At mount point, df shows the correct size for volume.
fixes: bz#1632889
Change-Id: Ib9f078aafb15e899a01086eae113270657ea916b
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
Diffstat (limited to 'xlators')
-rw-r--r-- | xlators/mgmt/glusterd/src/glusterd-store.c | 3 |
1 files changed, 2 insertions, 1 deletions
diff --git a/xlators/mgmt/glusterd/src/glusterd-store.c b/xlators/mgmt/glusterd/src/glusterd-store.c
index 2d9987971b8..f2f7d54a726 100644
--- a/xlators/mgmt/glusterd/src/glusterd-store.c
+++ b/xlators/mgmt/glusterd/src/glusterd-store.c
@@ -2713,6 +2713,8 @@ glusterd_store_retrieve_bricks(glusterd_volinfo_t *volinfo)
          * snapshot or snapshot restored volume this would be done post
          * creating the brick mounts
          */
+        if (gf_uuid_is_null(brickinfo->uuid))
+            (void)glusterd_resolve_brick(brickinfo);
         if (brickinfo->real_path[0] == '\0' && !volinfo->is_snap_volume &&
             gf_uuid_is_null(volinfo->restored_from_snap)) {
             /* By now if the brick is a local brick then it will be
@@ -2721,7 +2723,6 @@ glusterd_store_retrieve_bricks(glusterd_volinfo_t *volinfo)
              * with MY_UUID for realpath check. Hence do not handle
              * error
              */
-            (void)glusterd_resolve_brick(brickinfo);
             if (!gf_uuid_compare(brickinfo->uuid, MY_UUID)) {
                 if (!realpath(brickinfo->path, abspath)) {
                     gf_msg(this->name, GF_LOG_CRITICAL, errno,