author | Milind Changire <mchangir@redhat.com> | 2017-05-03 10:51:16 +0530 |
---|---|---|
committer | Jeff Darcy <jeff@pl.atyp.us> | 2017-05-03 16:29:49 +0000 |
commit | 4f7ef3020edcc75cdeb22d8da8a1484f9db77ac9 (patch) | |
tree | afb915c69dd457143574669b989dcd0bee70fe62 /xlators/protocol | |
parent | 6484558c7502e5afe1c96081dbe329ca5d9cb7e2 (diff) |
rpc: fix transport add/remove race on port probing
Problem:
Spurious __gf_free() assertion failures seen all over the place with
header->magic being overwritten when running port probing tests with
'nmap'
Solution:
Fix the order of these two steps, which were previously performed as listed:
1. add accept()ed socket connection fd to epoll set
2. add newly created rpc_transport_t object in RPCSVC service list
The correct order is #2 followed by #1.
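The ordering above can be illustrated with a small single-threaded
simulation. This is only a sketch and not the GlusterFS code: the
toy_transport struct and all toy_* functions are hypothetical stand-ins,
and the "poller" fires its error handler synchronously to mimic the event
that arrives immediately when a probed connection is reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a transport; not rpc_transport_t. */
struct toy_transport {
    int  fd;
    bool listed;     /* present in the toy service list */
    bool destroyed;  /* set instead of free() so the state can be inspected */
};

/* Error path of the toy poller: delist and destroy the transport. */
static void toy_on_error(struct toy_transport *t)
{
    t->listed = false;
    t->destroyed = true;
}

/* Toy poll-set registration: for a reset peer the event fires at once. */
static void toy_poll_add(struct toy_transport *t, bool peer_reset)
{
    if (peer_reset)
        toy_on_error(t);
}

/* Toy service-list registration. */
static void toy_service_add(struct toy_transport *t)
{
    t->listed = true;
}

/*
 * Runs the two steps in the chosen order.
 * Returns true if a destroyed transport ends up in the service list,
 * which is the dangling reference described in this commit message.
 */
bool toy_accept(struct toy_transport *t, bool list_first, bool peer_reset)
{
    assert(t != NULL);
    if (list_first) {
        toy_service_add(t);           /* step #2 first */
        toy_poll_add(t, peer_reset);  /* then step #1 */
    } else {
        toy_poll_add(t, peer_reset);  /* step #1 first: old order */
        toy_service_add(t);           /* lists an already destroyed object */
    }
    return t->listed && t->destroyed;
}
```

Calling toy_accept() with list_first set to false and peer_reset set to
true returns true (a destroyed transport is left in the list); with
list_first set to true the error path finds the transport and delists it
before destroying it.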
Reason:
Adding new fd returned by accept() to epoll set causes an epoll_wait()
to return immediately with a POLLIN event. This races ahead to a readv()
which returns with errno:104 (Connection reset by peer) during port
probing using 'nmap'. The error is then handled by POLLERR code to
remove the new transport object from RPCSVC service list and later
unref and destroy the rpc transport object.
socket_server_event_handler() then catches up with registering the
unref'd/destroyed rpc transport object. This later manifests as
assertion failures in __gf_free() with the header->magic field botched
due to invalid address references.
All this does not result in a Segmentation Fault since the address
space continues to be mapped into the process and the pages are still
referenced elsewhere.
As a further note:
This race happens only in the accept() codepath. Only in this codepath
does the notify refer to two transports:
1. listener transport and
2. newly accepted transport
All other notifies refer to only one transport, i.e., the transport/socket
on which the event is received. Since epoll is ONE_SHOT, another event won't
arrive on the same socket till the current event is processed. However, in
the accept() codepath, the current event - ACCEPT - and the new event -
POLLIN/POLLERR - arrive on two different sockets:
1. ACCEPT on listener socket and
2. POLLIN/POLLERR on newly registered socket.
Also, note that these two events are handled in different thread contexts.
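The one-shot behaviour referred to above can be observed with plain Linux
epoll. The sketch below is independent of GlusterFS: oneshot_demo() is a
hypothetical helper, and it assumes the ONE_SHOT mentioned here corresponds
to the EPOLLONESHOT flag. It shows that a second event on the same fd is not
delivered until the fd is re-armed, which is why the protection does not
extend to a different, newly registered fd.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Fills out[] with the number of ready events seen by three waits:
 *   out[0]: first wait after data arrives
 *   out[1]: wait after more data arrives, before re-arming
 *   out[2]: wait after re-arming with EPOLL_CTL_MOD
 * Returns 0 on success, -1 on any system call failure.
 */
int oneshot_demo(int out[3])
{
    int sv[2];
    int ep;
    int rc = -1;
    struct epoll_event ev = {0};
    struct epoll_event got;

    assert(out != NULL);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    ep = epoll_create1(0);
    if (ep < 0)
        goto out_sock;

    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.fd = sv[0];
    if (epoll_ctl(ep, EPOLL_CTL_ADD, sv[0], &ev) < 0)
        goto out_ep;

    if (write(sv[1], "x", 1) != 1)
        goto out_ep;
    out[0] = epoll_wait(ep, &got, 1, 100);  /* event is delivered once */

    if (write(sv[1], "y", 1) != 1)
        goto out_ep;
    out[1] = epoll_wait(ep, &got, 1, 100);  /* fd is disarmed: times out */

    if (epoll_ctl(ep, EPOLL_CTL_MOD, sv[0], &ev) < 0)
        goto out_ep;
    out[2] = epoll_wait(ep, &got, 1, 100);  /* re-armed: delivered again */
    rc = 0;

out_ep:
    close(ep);
out_sock:
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

In the accept() codepath the listener fd is the disarmed one, while the
newly accepted fd is freshly armed, so its first event can be picked up by
another thread while the ACCEPT event is still being processed.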
Cleanup:
Critical section in socket_server_event_handler() has been removed.
Instead, an additional ref on new_trans has been used to avoid ref/unref
race when notifying RPCSVC.
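The effect of the additional ref can be sketched with a toy reference
count. This is an illustration under stated assumptions, not the actual
patch: toy_rc, toy_ref(), toy_unref() and toy_accept_with_guard() are
hypothetical names, the counter is not atomic, and the unref that would
come from the error path on another thread is simulated inline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted object; not rpc_transport_t. */
struct toy_rc {
    int  refs;
    bool destroyed;  /* set instead of free() so the state can be inspected */
};

static void toy_ref(struct toy_rc *t)
{
    assert(!t->destroyed);
    t->refs++;
}

static void toy_unref(struct toy_rc *t)
{
    assert(t->refs > 0);
    if (--t->refs == 0)
        t->destroyed = true;
}

/*
 * Simulates the accept handler. Returns true if the object is still
 * alive at the point where the service would be notified.
 */
bool toy_accept_with_guard(struct toy_rc *t, bool extra_ref)
{
    bool alive_at_notify;

    assert(t != NULL);
    t->refs = 1;          /* reference held from creation */
    t->destroyed = false;

    if (extra_ref)
        toy_ref(t);       /* guard ref held by the accept handler */

    toy_unref(t);         /* stands in for the error path on another thread */

    alive_at_notify = !t->destroyed;  /* notification would happen here */

    if (extra_ref)
        toy_unref(t);     /* drop the guard once notification is done */

    return alive_at_notify;
}
```

With extra_ref set to true the object survives until the handler drops its
guard ref; with extra_ref set to false it is already destroyed when the
notification point is reached.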
Change-Id: I4417924bc9e6277d24bd1a1c5bcb7445bcb226a3
BUG: 1438966
Signed-off-by: Milind Changire <mchangir@redhat.com>
Reviewed-on: https://review.gluster.org/17139
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Diffstat (limited to 'xlators/protocol')
0 files changed, 0 insertions, 0 deletions