author: Poornima G <pgurusid@redhat.com> 2016-06-06 06:29:40 -0400
committer: Kaushal M <kaushal@redhat.com> 2016-08-24 21:38:22 -0700
commit: 9cd5066226770cf3c06a21757b963d315b8fe32b
tree: 08d4507baae5258bf88cade81d0a0a2ee63a80aa /api
parent: a97ccc632fc425156e24b7c71de97248de62b2de
gfapi: Fix IO error caused when there are consecutive graph switches
Issue:
Consider a simple situation where glfs_init() has completed, i.e. the initial graph is up. Now perform 2 volume sets that result in 2 client-side graph changes. Any IO performed after this fails with ENOTCONN. The only way to recover this client is, I guess, another graph switch or a restart.

What actually happens, from the code perspective:
Let the initial graph be A, followed by 2 consecutive graph switches to B and C, without any IO between those two switches.
- graph_setup(A) as a result of GF_EVENT_CHILD_UP, and fs->next_subvol = A.
- glfs_init() results in fs->active_subvol = A, fs->next_subvol = NULL.
- graph_setup(B) as a result of GF_EVENT_CHILD_UP, and fs->next_subvol = B.
- graph_setup(C) as a result of GF_EVENT_CHILD_UP, and fs->next_subvol = C. It also sees that the previous graph B was never set as fs->active_subvol, i.e. no IO or anything else happened on B, so it can safely send GF_EVENT_PARENT_DOWN (by calling glfs_subvol_done(B)). This parent down on B results in child_down(B), which is fine. But child_down also triggers graph_setup(B).
- graph_setup(B) as a result of GF_EVENT_CHILD_DOWN, and fs->next_subvol = B, and GF_EVENT_PARENT_DOWN on C as explained above. This again leads to GF_EVENT_CHILD_DOWN on C.
- graph_setup(C) as a result of GF_EVENT_CHILD_DOWN, and fs->next_subvol = C, and GF_EVENT_PARENT_DOWN on B as explained above.

Thus both graphs B and C are disconnected, hence the ENOTCONN.

Solution:
Remove the call to graph_setup() when the event is GF_EVENT_CHILD_DOWN. I don't see any reason why graph_setup() should be called on child_down. I am not sure what the original reason was for having graph_setup() in child_down; git history shows that the very first patch already had this call.
> Reviewed-on: http://review.gluster.org/14656
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>

BUG: 1367294
Change-Id: I9de86555f66cc94a05649ac863b40ed3426ffd4b
Signed-off-by: Poornima G <pgurusid@redhat.com>
Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-on: http://review.gluster.org/14835
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Kaushal M <kaushal@redhat.com>
Diffstat (limited to 'api')
-rw-r--r-- api/src/glfs-master.c | 1
1 files changed, 0 insertions, 1 deletions
diff --git a/api/src/glfs-master.c b/api/src/glfs-master.c
index 77e2d53abb9..b49ce2c8447 100644
--- a/api/src/glfs-master.c
+++ b/api/src/glfs-master.c
@@ -110,7 +110,6 @@ notify (xlator_t *this, int event, void *data, ...)
                         pthread_cond_broadcast (&fs->child_down_cond);
                 }
                 pthread_mutex_unlock (&fs->mutex);
-                graph_setup (fs, graph);
                 glfs_init_done (fs, 1);
                 break;
         case GF_EVENT_CHILD_CONNECTING: