author | Pranith Kumar K <pkarampu@redhat.com> | 2016-10-19 15:50:50 +0530 |
---|---|---|
committer | Raghavendra G <rgowdapp@redhat.com> | 2016-10-24 23:42:20 -0700 |
commit | a6b63e11b7758cf1bfcb67985e25ec02845f0995 (patch) | |
tree | a091bd67a68903ebeb5fdc67062f7610b6c9eb4d /rpc/rpc-lib | |
parent | bca6d0ba54d12d389cfb5c2b37fb8cc12a7e044b (diff) |
rpc: Fix the race between notification and reconnection
Problem:
There was a hang because an unlock on an entry failed with
ENOTCONN.
The client thinks the connection is down, whereas the server thinks
the connection is up.
This is the race we are seeing:
1) The connection from the client to the brick disconnects.
2) Saved frames unwind is called, which unwinds all
frames that were wound before the disconnect.
3) The client reconnects to the brick and setvolume
completes.
4) The disconnect notification for the connection in 1)
arrives only now and calls client_rpc_notify(), which
marks the connection as offline even though the
connection is up.
This happens because I/O can retrigger the connection
before the disconnect notification is sent to the higher
layers in rpc.
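The four steps above can be replayed in a small model. This is only a sketch for illustration: the struct and every function name in it are hypothetical and do not exist in the GlusterFS sources, and step 2 (unwinding saved frames) is left out because it does not affect the two flags being tracked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: none of these names exist in the GlusterFS sources. */
struct race_model {
    bool transport_up;   /* state of the actual connection to the brick */
    bool marked_online;  /* what the upper layer believes */
};

static void model_disconnect(struct race_model *m)
{
    m->transport_up = false;
}

static void model_reconnect_and_setvolume(struct race_model *m)
{
    m->transport_up = true;
    m->marked_online = true;
}

static void model_deliver_disconnect_notification(struct race_model *m)
{
    /* The upper layer reacts to the notification by marking itself offline. */
    m->marked_online = false;
}

/* Replays the order of events described in the commit message. */
static struct race_model model_run_racy_order(void)
{
    struct race_model m = { .transport_up = true, .marked_online = true };

    model_disconnect(&m);                       /* step 1 */
    /* step 2 (unwinding saved frames) is not modelled */
    model_reconnect_and_setvolume(&m);          /* step 3 */
    model_deliver_disconnect_notification(&m);  /* step 4, arrives late */
    return m;
}

/* The two views disagree: the transport is up, the upper layer says offline. */
static void model_check_racy_order(void)
{
    struct race_model m = model_run_racy_order();

    assert(m.transport_up);
    assert(!m.marked_online);
}
```

Calling model_check_racy_order() passes, which mirrors the reported symptom: the connection is up, but the client side has been marked offline by the late notification.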
Fix:
Notify the higher layers that a disconnect happened, and only
then go ahead with the reconnect logic.
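A minimal sketch of that ordering, assuming a callback-based client similar in shape to the notifyfn/mydata pair visible in the diff. All sketch_* names are hypothetical, and the immediate reconnect stands in for the real cleanup and reconnect scheduling, which this model does not attempt to reproduce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the GlusterFS rpc-lib API. */
enum sketch_event { SKETCH_DISCONNECT, SKETCH_CONNECT };

struct sketch_clnt;
typedef int (*sketch_notify_fn)(struct sketch_clnt *clnt, void *mydata,
                                enum sketch_event ev);

struct sketch_clnt {
    sketch_notify_fn notifyfn;  /* upper-layer callback */
    void *mydata;               /* opaque upper-layer state */
    bool transport_up;
};

struct sketch_upper {
    bool online;
};

/* Upper layer: track the last event it was told about. */
static int sketch_upper_notify(struct sketch_clnt *clnt, void *mydata,
                               enum sketch_event ev)
{
    struct sketch_upper *up = mydata;

    (void)clnt;
    up->online = (ev == SKETCH_CONNECT);
    return 0;
}

/* Stand-in for the reconnect path; the real code schedules this elsewhere. */
static void sketch_reconnect(struct sketch_clnt *clnt)
{
    clnt->transport_up = true;
    if (clnt->notifyfn)
        clnt->notifyfn(clnt, clnt->mydata, SKETCH_CONNECT);
}

/* Fixed ordering: tell the upper layer first, then allow the reconnect. */
static int sketch_handle_disconnect(struct sketch_clnt *clnt)
{
    int ret = 0;

    clnt->transport_up = false;
    if (clnt->notifyfn)
        ret = clnt->notifyfn(clnt, clnt->mydata, SKETCH_DISCONNECT);

    sketch_reconnect(clnt);
    return ret;
}

/* Returns true when both layers agree on the connection state afterwards. */
static bool sketch_states_agree_after_disconnect(void)
{
    struct sketch_upper up = { .online = true };
    struct sketch_clnt clnt = { sketch_upper_notify, &up, true };
    int ret = sketch_handle_disconnect(&clnt);

    assert(ret == 0);
    return clnt.transport_up == up.online;
}
```

Because the disconnect notification is delivered before anything can bring the transport back up, the later connect notification is always the last word, and sketch_states_agree_after_disconnect() returns true in this model.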
For the logs that support the analysis above, see:
https://bugzilla.redhat.com/show_bug.cgi?id=1386626#c1
Thanks to Raghavendra G for suggesting the correct fix.
BUG: 1386626
Change-Id: I3c84ba1f17010bd69049fa88ec5f0ae431f8cda9
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15681
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Diffstat (limited to 'rpc/rpc-lib')
-rw-r--r-- | rpc/rpc-lib/src/rpc-clnt.c | 7 |
1 files changed, 4 insertions, 3 deletions
```diff
diff --git a/rpc/rpc-lib/src/rpc-clnt.c b/rpc/rpc-lib/src/rpc-clnt.c
index e8a8ea2ecd9..3caab985cfe 100644
--- a/rpc/rpc-lib/src/rpc-clnt.c
+++ b/rpc/rpc-lib/src/rpc-clnt.c
@@ -898,6 +898,10 @@ rpc_clnt_notify (rpc_transport_t *trans, void *mydata,
         switch (event) {
         case RPC_TRANSPORT_DISCONNECT:
         {
+                if (clnt->notifyfn)
+                        ret = clnt->notifyfn (clnt, clnt->mydata,
+                                              RPC_CLNT_DISCONNECT, NULL);
+
                 rpc_clnt_connection_cleanup (conn);
 
                 pthread_mutex_lock (&conn->lock);
@@ -921,9 +925,6 @@ rpc_clnt_notify (rpc_transport_t *trans, void *mydata,
                 }
                 pthread_mutex_unlock (&conn->lock);
 
-                if (clnt->notifyfn)
-                        ret = clnt->notifyfn (clnt, clnt->mydata,
-                                              RPC_CLNT_DISCONNECT, NULL);
                 if (unref_clnt)
                         rpc_clnt_ref (clnt);
 
```