summaryrefslogtreecommitdiffstats
path: root/doc/developer-guide/fuse-interrupt.md
diff options
context:
space:
mode:
Diffstat (limited to 'doc/developer-guide/fuse-interrupt.md')
-rw-r--r--doc/developer-guide/fuse-interrupt.md211
1 files changed, 211 insertions, 0 deletions
diff --git a/doc/developer-guide/fuse-interrupt.md b/doc/developer-guide/fuse-interrupt.md
new file mode 100644
index 00000000000..ec991b81ec5
--- /dev/null
+++ b/doc/developer-guide/fuse-interrupt.md
@@ -0,0 +1,211 @@
+# Fuse interrupt handling
+
+## Conventions followed
+
+- *FUSE* refers to the "wire protocol" between kernel and userspace and
+ related specifications.
+- *fuse* refers to the kernel subsystem and also to the GlusterFs translator.
+
+## FUSE interrupt handling spec
+
+The [Linux kernel FUSE documentation](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/fuse.txt?h=v4.18#n148)
+desrcibes how interrupt handling happens in fuse.
+
+## Interrupt handling in the fuse translator
+
+### Declarations
+
+This document describes the internal API in the fuse translator with which
+interrupt can be handled.
+
+The API being internal (to be used only in fuse-bridge.c; the functions are
+not exported to a header file).
+
+```
+enum fuse_interrupt_state {
+ /* ... */
+ INTERRUPT_SQUELCHED,
+ INTERRUPT_HANDLED,
+ /* ... */
+};
+typedef enum fuse_interrupt_state fuse_interrupt_state_t;
+struct fuse_interrupt_record;
+typedef struct fuse_interrupt_record fuse_interrupt_record_t;
+typedef void (*fuse_interrupt_handler_t)(xlator_t *this,
+ fuse_interrupt_record_t *);
+struct fuse_interrupt_record {
+ fuse_in_header_t fuse_in_header;
+ void *data;
+ /*
+ ...
+ */
+};
+
+fuse_interrupt_record_t *
+fuse_interrupt_record_new(fuse_in_header_t *finh,
+ fuse_interrupt_handler_t handler);
+
+void
+fuse_interrupt_record_insert(xlator_t *this, fuse_interrupt_record_t *fir);
+
+gf_boolean_t
+fuse_interrupt_finish_fop(call_frame_t *frame, xlator_t *this,
+ gf_boolean_t sync, void **datap);
+
+void
+fuse_interrupt_finish_interrupt(xlator_t *this, fuse_interrupt_record_t *fir,
+ fuse_interrupt_state_t intstat,
+ gf_boolean_t sync, void **datap);
+```
+
+The code demonstrates the usage of the API through `fuse_flush()`. (It's a
+dummy implementation only for demonstration purposes.) Flush is chosen
+because a `FLUSH` interrupt is easy to trigger (see
+*tests/features/interrupt.t*). Interrupt handling for flush is switched on
+by `--fuse-flush-handle-interrupt` (a hidden glusterfs command line flag).
+The implementation of flush interrupt is contained in the
+`fuse_flush_interrupt_handler()` function and blocks guarded by the
+
+```
+if (priv->flush_handle_interrupt) { ...
+```
+
+conditional (where `priv` is a `*fuse_private_t`).
+
+### Overview
+
+"Regular" fuse fops and interrupt handlers interact via a list containing
+interrupt records.
+
+If a fop wishes to have its interrupts handled, it needs to set up an
+interrupt record and insert it into the list; also when it's to finish
+(ie. in its "cbk" stage) it needs to delete the record from the list.
+
+If no interrupt happens, basically that's all to it - a list insertion
+and deletion.
+
+However, if an interrupt comes for the fop, the interrupt FUSE request
+will carry the data identifying an ongoing fop (that is, its `unique`),
+and based on that, the interrupt record will be looked up in the list, and
+the specific interrupt handler (a member of the interrupt record) will be
+called.
+
+Usually the fop needs to share some data with the interrupt handler to
+enable it to perform its task (also shared via the interrupt record).
+The interrupt API offers two approaches to manage shared data:
+- _Async or reference-counting strategy_: from the point on when the interrupt
+ record is inserted to the list, it's owned jointly by the regular fop and
+ the prospective interrupt handler. Both of them need to check before they
+ return if the other is still holding a reference; if not, then they are
+ responsible for reclaiming the shared data.
+- _Sync or borrow strategy_: the interrupt handler is considered a borrower
+ of the shared data. The interrupt handler should not reclaim the shared
+ data. The fop will wait for the interrupt handler to finish (ie., the borrow
+ to be returned), then it has to reclaim the shared data.
+
+The user of the interrupt API need to call the following functions to
+instrument this control flow:
+- `fuse_interrupt_record_insert()` in the fop to insert the interrupt record to
+ the list;
+- `fuse_interrupt_finish_fop()`in the fop (cbk) and
+- `fuse_interrupt_finish_interrupt()`in the interrupt handler
+
+to perform needed synchronization at the end their tenure. The data management
+strategies are implemented by the `fuse_interrupt_finish_*()` functions (which
+have an argument to specify which strategy to use); these routines take care
+of freeing the interrupt record itself, while the reclamation of the shared data
+is left to the API user.
+
+### Usage
+
+A given FUSE fop can be enabled to handle interrupts via the following
+steps:
+
+- Define a handler function (of type `fuse_interrupt_handler_t`).
+ It should implement the interrupt handling logic and in the end
+ call (directly or as async callback) `fuse_interrupt_finish_interrupt()`.
+ The `intstat` argument to `fuse_interrupt_finish_interrupt` should be
+ either `INTERRUPT_SQUELCHED` or `INTERRUPT_HANDLED`.
+ - `INTERRUPT_SQUELCHED` means that the interrupt could not be delivered
+ and the fop is going on uninterrupted.
+ - `INTERRUPT_HANDLED` means that the interrupt was actually handled. In
+ this case the fop will be answered from interrupt context with errno
+ `EINTR` (that is, the fop should not send a response to the kernel).
+
+ (the enum `fuse_interrupt_state` includes further members, which are reserved
+ for internal use).
+
+ We return to the `sync` and `datap` arguments later.
+- In the `fuse_<FOP>` function create an interrupt record using
+ `fuse_interrupt_record_new()`, passing the incoming `fuse_in_header` and
+ the above handler function to it.
+ - Arbitrary further data can be referred to via the `data` member of the
+ interrupt record that is to be passed on from fop context to
+ interrupt context.
+- When it's set up, pass the interrupt record to
+ `fuse_interrupt_record_insert()`.
+- In `fuse_<FOP>_cbk` call `fuse_interrupt_finish_fop()`.
+ - `fuse_interrupt_finish_fop()` returns a Boolean according to whether the
+ interrupt was handled. If it was, then the FUSE request is already
+ answered and the stack gets destroyed in `fuse_interrupt_finish_fop` so
+ `fuse_<FOP>_cbk()` can just return (zero). Otherwise follow the standard
+ cbk logic (answer the FUSE request and destroy the stack -- these are
+ typically accomplished by `fuse_err_cbk()`).
+- The last two argument of `fuse_interrupt_finish_fop()` and
+ `fuse_interrupt_finish_interrupt()` are `gf_boolean_t sync` and
+ `void **datap`.
+ - `sync` represents the strategy for freeing the interrupt record. The
+ interrupt handler and the fop handler are in race to get at the interrupt
+ record first (interrupt handler for purposes of doing the interrupt
+ handling, fop handler for purposes of deactivating the interrupt record
+ upon completion of the fop handling).
+ - If `sync` is true, then the fop handler will wait for the interrupt
+ handler to finish and it takes care of freeing.
+ - If `sync` is false, the loser of the above race will perform freeing.
+
+ Freeing is done within the respective interrupt finish routines, except
+ for the `data` field of the interrupt record; with respect to that, see
+ the discussion of the `datap` parameter below. The strategy has to be
+ consensual, that is, `fuse_interrupt_finish_fop()` and
+ `fuse_interrupt_finish_interrupt()` must pass the same value for `sync`.
+ If dismantling the resources associated with the interrupt record is
+ simple, `sync = _gf_false` is the suggested choice; `sync = _gf_true` can
+ be useful in the opposite case, when dismantling those resources would
+ be inconvenient to implement in two places or to enact in non-fop context.
+ - If `datap` is `NULL`, the `data` member of the interrupt record will be
+ freed within the interrupt finish routine. If it points to a valid
+ `void *` pointer, and if caller is doing the cleanup (see `sync` above),
+ then that pointer will be directed to the `data` member of the interrupt
+ record and it's up to the caller what it's doing with it.
+ - If `sync` is true, interrupt handler can use `datap = NULL`, and
+ fop handler will have `datap` point to a valid pointer.
+ - If `sync` is false, and handlers pass a pointer to a pointer for
+ `datap`, they should check if the pointed pointer is NULL before
+ attempting to deal with the data.
+
+### FUSE answer for the interrupted fop
+
+The kernel acknowledges a successful interruption for a given FUSE request
+if the filesystem daemon answers it with errno EINTR; upon that, the syscall
+which induced the request will be abruptly terminated with an interrupt, rather
+than returning a value.
+
+In glusterfs, this can be arranged in two ways.
+
+- If the interrupt handler wins the race for the interrupt record, ie.
+ `fuse_interrupt_finish_fop()` returns true to `fuse_<FOP>_cbk()`, then, as
+ said above, `fuse_<FOP>_cbk()` does not need to answer the FUSE request.
+ That's because then the interrupt handler will take care about answering
+ it (with errno EINTR).
+- If `fuse_interrupt_finish_fop()` returns false to `fuse_<FOP>_cbk()`, then
+ this return value does not inform the fop handler whether there was an interrupt
+ or not. This return value occurs both when fop handler won the race for the
+ interrupt record against the interrupt handler, and when there was no interrupt
+ at all.
+
+ However, the internal logic of the fop handler might detect from other
+ circumstances that an interrupt was delivered. For example, the fop handler
+ might be sleeping, waiting for some data to arrive, so that a premature
+ wakeup (with no data present) occurs if the interrupt handler intervenes. In
+ such cases it's the responsibility of the fop handler to reply the FUSE
+ request with errro EINTR.