From 23b48ac03b60417fbc2d93399a91a57b850f8fbd Mon Sep 17 00:00:00 2001
From: Niels de Vos
Date: Thu, 5 May 2016 17:30:26 +0200
Subject: Move "Mandatory Locks" for 3.8 to "done"

Change-Id: Ifd0695dd146f0ceb99e4978945b3156d6f13d17a
Signed-off-by: Niels de Vos
Reviewed-on: http://review.gluster.org/14227
Reviewed-by: Anoop C S
Reviewed-by: Poornima G
---
 accepted/Mandatory Locks.md | 212 --------------------------------------------
 1 file changed, 212 deletions(-)
 delete mode 100644 accepted/Mandatory Locks.md

(limited to 'accepted')

diff --git a/accepted/Mandatory Locks.md b/accepted/Mandatory Locks.md
deleted file mode 100644
index c8b09a5..0000000
--- a/accepted/Mandatory Locks.md
+++ /dev/null
@@ -1,212 +0,0 @@
Feature
-------
Provide mandatory lock support for a multiprotocol environment.

Summary
-------
POSIX.1 does not specify any scheme for mandatory locking, whereas the Linux
kernel provides support for mandatory locks based on file mode bits, as
explained at . The proposed feature does not adhere completely to the
semantics described by the Linux kernel. Instead, we enforce core mandatory
lock semantics at byte-range granularity, as detailed below, without the help
of file mode bits.

Owners
------
Anoop C S

Poornima G

Raghavendra Talur

Rajesh Joseph

Current status
--------------
As of now, we have the POSIX locks translator loaded on the server side to
handle advisory in-memory locks, and we have support for mandatory locks
according to Linux kernel semantics. But as we move forward with the
integration of GlusterFS with other protocols such as NFS or SMB, we will
have to add support, apart from the default POSIX-style advisory locks (or
even the Linux kernel's mandatory locks), for applications to make use of
mandatory locks independent of the file mode bits defined by the Linux
kernel. This is a mandatory requirement for GlusterFS to cope with a
multiprotocol environment.
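For reference, the Linux kernel convention mentioned above (set-group-ID bit
on, group-execute bit off marks a file as a mandatory-locking candidate) can
be illustrated with a short sketch. This only demonstrates the mode-bit
convention the text refers to; actual kernel enforcement additionally
requires the filesystem to be mounted with the `mand` option, which is not
shown here:

```python
# Illustrative only: set the mode-bit combination the Linux kernel
# interprets as "mandatory locking candidate" (setgid on, group-exec off).
# Enforcement also requires mounting the filesystem with -o mand.
import os
import stat
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

# rw-r--r-- plus set-group-ID, and no group-execute bit
os.chmod(path, 0o644 | stat.S_ISGID)

mode = os.stat(path).st_mode
is_candidate = bool(mode & stat.S_ISGID) and not (mode & stat.S_IXGRP)
print(is_candidate)   # True: the kernel would treat fcntl locks as mandatory

os.unlink(path)
```

The proposed 'file' mode described later in this document mirrors this
convention, while the 'forced' and 'optimal' modes deliberately drop the
dependency on mode bits.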
Related Feature Requests and Bugs
---------------------------------
* BZ 762184 - Support mandatory locking in glusterfs

* BZ 1194546 - Write behind returns success for a write irrespective of a
  conflicting lock held by another application.

* BZ 1287099 - Race between mandatory lock request and ongoing read/write

Detailed Description
--------------------
* By default, mandatory locking will be disabled for a volume. With all
  the patches listed towards the end, four modes ('off', 'file', 'forced'
  and 'optimal') are available for the volume set option 'mandatory-locking',
  which holds the value 'off' to begin with.

* Following the POSIX standard, lock requests from all GlusterFS native
  clients will be considered advisory locks. If the cluster is accessed only
  by native clients, then multiple access to a single file can be controlled
  by setting the set-group-ID bit in its file mode while removing the
  group-execute bit (as per the mandatory locks documentation for the Linux
  kernel). This behaviour is achieved by setting the mandatory-locking option
  to 'file' (implementation not yet done). As a result, normal fcntl locks on
  those files whose required bits are set/unset are considered mandatory in
  nature, and a byte-range conflict check is therefore performed during every
  data-modifying fop.

* With other protocol clients accessing the cluster, to provide advanced
  protection to files residing in a volume, we can use the 'forced' mode to
  ensure volume-wide mandatory lock behaviour, which performs a conflicting
  advisory/mandatory byte-range lock check before every
  read/write/truncate/zerofill fop.

* In a similar environment with different protocol clients accessing the
  cluster, an 'optimal' mode is more suitable, based on the following
  semantics:

  * Each read/write/truncate will check for conflicting mandatory locks and
    correspondingly block/allow the fop.
  * Locks from POSIX clients will always be advisory, and any other POSIX
    client can read/write/truncate without a conflicting/overlapping
    byte-range check.

  * No client is allowed to read/write into regions where a
    conflicting/overlapping mandatory byte-range lock is held by another
    client.

  * Any gfapi client that requires a mandatory lock on a particular
    byte range will have to use the glfs_common_lock() API to do so.

* Mandatory locks can be of two types, namely shared and exclusive, with
  semantics similar to the current fcntl locks. The following table explains
  the extra checks made during various fops acting on an overlapping byte
  range of a particular file:

  +----------------------------+-------------+----------------+
  | Incoming FOP/Existing LOCK | SHARED_LOCK | EXCLUSIVE_LOCK |
  +----------------------------+-------------+----------------+
  | READ                       | Success     | Wait/Fail      |
  | WRITE/FTRUNCATE/ZEROFILL   | Wait/Fail   | Wait/Fail      |
  | SHARED_LOCK                | Granted     | Wait/Fail      |
  | EXCLUSIVE_LOCK             | Wait/Fail   | Wait/Fail      |
  | OPEN (with O_TRUNC)        | Fail        | Fail           |
  +----------------------------+-------------+----------------+

  * Wait = make the fop wait till the conflicting lock is released. This is
    the default.
  * Fail = return EAGAIN if the fd flags contain O_NONBLOCK, except in the
    case of an OPEN call with the O_TRUNC flag, where we return EAGAIN
    without checking O_NONBLOCK.

* Other than normal client-initiated file operations, internal fops
  associated with GlusterFS, such as self-heal and rebalance, will have to
  bypass the byte-range conflict check for mandatory locks and proceed as
  normal under the 'optimal' and 'forced' mandatory-locking modes for a
  volume (see Dependencies for more details on problems associated with lock
  healing and lock migration w.r.t. self-heal and rebalance).

Dependencies
------------
1. During a disconnect between client and server, locks get cleaned up on
   the server side.
   When the client comes back online, we do not heal fcntl locks, so there
   is no way to recover locks for that particular brick. Related problems
   and proposed solutions can be reviewed at the following link:

2. There is a race window where a mandatory lock request from a new client
   on a byte range that overlaps a write from an older client still ongoing
   in the backend ends up being granted, which breaks the guarantee given to
   the new client. In this scenario, we will have to check for a conflicting
   inodelk in the particular range the older client is writing to. But this
   check will not cover the ongoing-read use case, since reads are not
   associated with an inodelk. Instead, we can internally take byte-range
   locks for fops like read and write to ensure mandatory lock semantics in
   a much better way. The following BZ has been created to track the issue:

3. Considering rebalance of files, we have to make sure that locks
   associated with a file are also migrated to its new destination. Failure
   to migrate locks may result in undesired access to files under the new
   destination. The proposed design can be tracked at the following link:

Relevant patches
----------------
*
*

Benefit to GlusterFS
--------------------
This feature will allow other protocol clients, such as SMB or NFS,
accessing gluster volumes via libgfapi to make use of the mandatory locking
semantics described above.

Scope
-----

#### Nature of proposed change
To implement this feature, we make use of the current locks translator to
accommodate the necessary changes, without introducing a new translator. It
is important to note that mandatory lock requests received while mandatory
locking is disabled for a volume will be stored as-is inside the locks
translator. However, enforcement of advisory and mandatory lock requests
will be done based on the current mandatory-locking mode.
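The enforcement decision described here, together with the conflict table in
the Detailed Description, can be expressed as a small illustrative model.
All names below are hypothetical and simplified (for example, locks held by
the same owner are assumed not to conflict with that owner's own fops); this
is a sketch of the semantics, not the actual locks translator code:

```python
# Hypothetical sketch of the byte-range conflict check from the table in
# "Detailed Description"; not the actual GlusterFS locks translator code.
from dataclasses import dataclass

SHARED, EXCLUSIVE = "shared", "exclusive"


@dataclass
class MandatoryLock:
    owner: str
    start: int
    end: int    # inclusive byte range
    kind: str   # SHARED or EXCLUSIVE


def overlaps(lock, start, end):
    """True if [start, end] intersects the lock's byte range."""
    return lock.start <= end and start <= lock.end


def conflicts(lock, fop, want=None):
    """Apply the conflict table: reads pass over shared locks; writes,
    truncates and O_TRUNC opens conflict with any lock; a new shared lock
    is grantable only over existing shared locks."""
    if fop == "read":
        return lock.kind == EXCLUSIVE
    if fop in ("write", "ftruncate", "zerofill", "open_trunc"):
        return True
    if fop == "lock":
        return not (want == SHARED and lock.kind == SHARED)
    raise ValueError(fop)


def check(locks, owner, fop, start, end, want=None):
    """Return 'allow' or 'wait/fail' for an incoming fop on [start, end]."""
    for lk in locks:
        if lk.owner != owner and overlaps(lk, start, end) \
                and conflicts(lk, fop, want):
            return "wait/fail"
    return "allow"
```

For instance, with a shared mandatory lock held by one client on bytes 0-99,
a read from a second client is allowed while a write on the same range must
wait or fail, matching the table above.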
#### Implications on manageability
A new volume set option, 'mandatory-locking', will be available, accepting
the following values:
* off
* file
* optimal
* forced

Note: the volume set option takes effect only after a subsequent
start/restart of the volume.

#### Implications on presentation layer
A new locking API, namely glfs_lock(), will be exposed to allow applications
to apply for mandatory locks.

#### Implications on persistence layer
None

#### Implications on 'GlusterFS' backend
None

#### Modification to GlusterFS metadata
None

#### Implications on 'glusterd'
None

How To Test
-----------

User Experience
---------------

Dependencies
------------

Documentation
-------------
TBD

Status
------
In development

Comments and Discussion
-----------------------
-- cgit