diff options
author | Niels de Vos <ndevos@redhat.com> | 2017-07-10 09:45:02 +0200 |
---|---|---|
committer | Niels de Vos <ndevos@redhat.com> | 2017-07-11 13:25:31 +0000 |
commit | 9d5328e0e07353501392b33e9553300bd34a32ad (patch) | |
tree | 2f6ca08831b2495325c59cf31d0555f7ab9d8c32 /api | |
parent | 3d5ef4e2cf31f611e3cbcd865c0367bab44a9552 (diff) |
rpc/clnt: remove locks while notifying CONNECT/DISCONNECT
Locking during notify was introduced as part of commit
aa22f24f5db7659387704998ae01520708869873 [1]. The fix was introduced
to fix out-of-order CONNECT/DISCONNECT events from rpc-clnt to parent
xlators [2]. However as part of handling DISCONNECT protocol/client
does unwind saved frames (with failure) waiting for responses. This
saved_frames_unwind can be a costly operation and hence ideally
shouldn't be included in the critical section of notifylock, as it
unnecessarily delays the reconnection to same brick. Also, its not a
good practise to pass control to other xlators holding a lock as it
can lead to deadlocks. So, this patch removes locking in rpc-clnt
while notifying parent xlators.
To fix [2], two changes are present in this patch:
* notify DISCONNECT before cleaning up rpc connection (same as commit
a6b63e11b7758cf1bfcb6798, patch [3]).
* protocol/client uses rpc_clnt_cleanup_and_start, which cleans up rpc
connection and does a start while handling a DISCONNECT event from
rpc. Note that patch [3] was reverted as rpc_clnt_start called in
quick_reconnect path of protocol/client didn't invoke connect on
transport as the connection was not cleaned up _yet_ (as cleanup was
moved post notification in rpc-clnt). This resulted in clients never
attempting connect to bricks.
Note that one of the neater ways to fix [2] (without using locks) is
to introduce generation numbers to map CONNECT and DISCONNECTS across
epochs and ignore DISCONNECT events if they don't belong to current
epoch. However, this approach is a bit complex to implement and
requires time. So, current patch is a hacky stop-gap fix till we come
up with a more cleaner solution.
[1] http://review.gluster.org/15916
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1386626
[3] http://review.gluster.org/15681
Cherry picked from commit 773f32caf190af4ee48818279b6e6d3c9f2ecc79:
> Change-Id: I62daeee8bb1430004e28558f6eb133efd4ccf418
> Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
> BUG: 1427012
> Reviewed-on: https://review.gluster.org/16784
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Milind Changire <mchangir@redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Change-Id: I62daeee8bb1430004e28558f6eb133efd4ccf418
Reported-by: Markus Stockhausen <mst@collogia.de>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
BUG: 1462447
Reviewed-on: https://review.gluster.org/17733
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Diffstat (limited to 'api')
0 files changed, 0 insertions, 0 deletions