Diffstat (limited to 'Feature Planning/GlusterFS 3.5/Disk Encryption.md')
-rw-r--r-- Feature Planning/GlusterFS 3.5/Disk Encryption.md 443
1 files changed, 443 insertions, 0 deletions
diff --git a/Feature Planning/GlusterFS 3.5/Disk Encryption.md b/Feature Planning/GlusterFS 3.5/Disk Encryption.md
new file mode 100644
index 0000000..4c6ab89
--- /dev/null
+++ b/Feature Planning/GlusterFS 3.5/Disk Encryption.md
@@ -0,0 +1,443 @@
+Feature
+=======
+
+Transparent encryption. Allows a volume to be encrypted "at rest" on the
+server using keys only available on the client.
+
+1 Summary
+=========
+
+Distributed systems impose tighter requirements on at-rest encryption,
+because encrypted data is stored on servers that are de facto
+untrusted. Such data can be subjected to analysis and tampering, which
+can eventually reveal it if it is not properly protected. Encrypting
+the data alone is usually not enough: in distributed systems, serious
+protection of private data is possible only in conjunction with
+authentication. In GlusterFS, encryption is therefore combined with
+authentication. Currently we provide protection from "silent
+tampering", a kind of tampering that is hard to detect because it
+doesn't break POSIX compliance. Specifically, we protect the file's
+encryption-specific metadata. This metadata includes the file's unique
+object id (GFID), cipher algorithm id, cipher block size and other
+attributes used by the encryption process.
+
+1.1 Restrictions
+----------------
+
+1\. We encrypt only file content. Transparent encryption doesn't
+protect file names: they are neither encrypted nor verified.
+Protecting file names is less critical than protecting the file's
+encryption-specific metadata: any attack based on tampering with file
+names will break POSIX compliance and result in massive corruption,
+which is easy to detect.
+
+2\. Transparent encryption doesn't work in NFS mounts of GlusterFS
+volumes: NFS file handles introduce security issues that are hard to
+resolve. File operations on NFS mounts of encrypted GlusterFS volumes
+will fail (see section "Encryption in different types of mount
+sessions" for more details).
+
+3\. Transparent encryption is incompatible with the GlusterFS
+performance translators quick-read, write-behind and open-behind.
+
+2 Owners
+========
+
+Jeff Darcy <jdarcy@redhat.com>
+Edward Shishkin <eshishki@redhat.com>
+
+3 Current status
+================
+
+Merged upstream.
+
+4 Detailed Description
+======================
+
+See Summary.
+
+5 Benefit to GlusterFS
+======================
+
+Besides the justifications that have applied to on-disk encryption just
+about forever, recent events have raised awareness significantly.
+Encryption using keys that are physically present at the server leaves
+data vulnerable to physical seizure of the server. Encryption using keys
+that are kept by the same organizational entity leaves data vulnerable to
+"insider threat" plus coercion or capture at the organization level. For
+many, especially various kinds of service providers, only pure
+client-side encryption provides the necessary levels of privacy and
+deniability.
+
+Competitively, other projects - most notably
+[Tahoe-LAFS](https://leastauthority.com/) - are already using recently
+heightened awareness of these issues to attract users who would be
+better served by our performance/scalability, usability, and diversity
+of interfaces. Only the lack of proper encryption holds us back in these
+cases.
+
+6 Scope
+=======
+
+6.1. Nature of proposed change
+------------------------------
+
+This is a new client-side translator, using user-provided key
+information plus information stored in xattrs to encrypt data
+transparently as it's written and decrypt when it's read.
+
+6.2. Implications on manageability
+----------------------------------
+
+The user needs to manage a per-volume master key (MK). That is:
+
+1\) Generate an independent MK for every volume that is to be
+encrypted. Note that one MK is created for the whole life of the
+volume.
+
+2\) Provide the MK on the client side at every mount, at the location
+that was specified at volume create time or overridden via the
+respective mount option (see section "Getting Started With Crypt
+Translator").
+
+3\) Keep the MK between mount sessions. Note that after a successful
+mount the MK may be removed from the specified location. In this case
+the user should retain the MK safely until the next mount session.
+
+The MK is a 256-bit secret string known only to the user. Generating
+and retaining the MK is the user's responsibility.
+
+WARNING!!! Losing the MK will make the content of all regular files of
+your volume inaccessible. It is possible to mount a volume with an
+improper MK; however, such mount sessions allow access only to file
+names, as they are not encrypted.
+
+Recommendations on MK generation
+
+The MK has to be a high-entropy key, generated by a cryptographically
+secure random number generator. One possible way is to use rand(1),
+provided by the OpenSSL package, with the option "-hex" for the proper
+output format. For example, the following command prints a generated
+key to the standard output:
+
+    $ openssl rand -hex 32
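As an alternative to the command above, a key in the same format can be produced with Python's standard library. This is only a sketch: the helper name `write_master_key` and the choice of owner-only file permissions are assumptions made for illustration; the only thing taken from this document is the format (a 256-bit key written as 64 hex symbols).

```python
import os
import secrets


def write_master_key(path):
    """Write a fresh 256-bit key as 64 hex symbols to `path` (hypothetical helper)."""
    key_hex = secrets.token_hex(32)  # 32 random bytes -> 64 lowercase hex symbols
    # O_EXCL refuses to overwrite an existing key; 0o600 keeps the file owner-only.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(key_hex)
    return key_hex
```

Refusing to overwrite an existing file is a deliberate choice here, since replacing the MK of a populated volume makes its file contents inaccessible.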
+
+6.3. Implications on presentation layer
+---------------------------------------
+
+N/A
+
+6.4. Implications on persistence layer
+--------------------------------------
+
+N/A
+
+6.5. Implications on 'GlusterFS' backend
+----------------------------------------
+
+All encrypted files on the servers contain padding at the end of the
+file. That is, the size of every encrypted file on the servers is a
+multiple of the cipher block size. The real file size is stored as the
+file's xattr with the key "trusted.glusterfs.crypt.att.size". The
+translation padded-file-size -\> real-file-size (and back) is performed
+by the crypt translator.
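The size relation above can be sketched as a simple rounding rule. The function name `padded_size` and the default of 16 bytes (the AES block size) are assumptions for illustration; this document does not spell out the exact value the translator pads to, so the block size is left as a parameter.

```python
def padded_size(real_size, cipher_block_size=16):
    """Smallest multiple of cipher_block_size that is >= real_size."""
    blocks = -(-real_size // cipher_block_size)  # ceiling division
    return blocks * cipher_block_size
```

Under these assumptions a 17-byte file would occupy 32 bytes on the server, while the xattr would still record 17.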
+
+6.6. Modification to GlusterFS metadata
+---------------------------------------
+
+Encryption-specific metadata in a specified format is stored as the
+file's xattr with the key "trusted.glusterfs.crypt.att.cfmt". The current
+format of the metadata string is described in slide \#27 of the following
+[design document](http://www.gluster.org/community/documentation/index.php/File:GlusterFS_transparent_encryption.pdf)
+
+6.7. Options of the crypt translator
+------------------------------------
+
+- data-cipher-alg
+
+Specifies the cipher algorithm for file data encryption. Currently only
+one option is available: AES\_XTS. This is a hidden option.
+
+- block-size
+
+Specifies the size (in bytes) of the logical chunk that is encrypted as
+a whole unit in the file body. If cipher modes with initialization
+vectors are used for encryption, then the initialization vector gets
+reset for every such chunk.
+Available values are: "512", "1024", "2048" and "4096". Default value is
+"4096".
+
+- data-key-size
+
+Specifies size (in bits) of data cipher key. For AES\_XTS available
+values are: "256" and "512". Default value is "256". The larger key size
+("512") is for stronger security.
+
+- master-key
+
+Specifies the pathname of a regular file or symlink. Defines the
+location of the master volume key on the trusted client machine.
+
+7 Getting Started With Crypt Translator
+=======================================
+
+1\. Create a volume <vol_name>.
+
+2\. Turn on the crypt xlator:
+
+    # gluster volume set <vol_name> encryption on
+
+3\. Turn off the performance xlators that encryption is currently
+incompatible with:
+
+    # gluster volume set <vol_name> performance.quick-read off
+    # gluster volume set <vol_name> performance.write-behind off
+    # gluster volume set <vol_name> performance.open-behind off
+
+4\. (optional) Set the location of the volume master key:
+
+    # gluster volume set <vol_name> encryption.master-key <master_key_location>
+
+where <master_key_location> is the absolute pathname of the file that
+will contain the volume master key (see section "Implications on
+manageability").
+
+5\. (optional) Override the default options of the crypt xlator:
+
+    # gluster volume set <vol_name> encryption.data-key-size <data_key_size>
+
+where <data_key_size> should have one of the following values:
+"256" (default), "512".
+
+    # gluster volume set <vol_name> encryption.block-size <block_size>
+
+where <block_size> should have one of the following values: "512",
+"1024", "2048", "4096" (default).
+
+6\. Define the location of the master key on your client machine
+(<master_key_new_location>) if it wasn't specified at step 4 above, or
+if you want it to be different from the <master_key_location>
+specified at step 4.
+
+7\. On the client side make sure that the file named
+<master_key_location> (or <master_key_new_location> defined at step 6)
+exists and contains the respective per-volume master key (see section
+"Implications on manageability"). This key has to be in hex form, i.e.
+it should be represented by 64 symbols from the set {'0', ..., '9',
+'a', ..., 'f'}. The key should start at the beginning of the file. All
+symbols at offsets \>= 64 are ignored.
+
+NOTE: <master_key_location> (or <master_key_new_location> defined at
+step 6) can be a symlink. In this case make sure that the target file
+of this symlink exists and contains the respective per-volume master
+key.
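The key file format described in step 7 can be checked before mounting. The following is a minimal sketch, not the translator's own parser: the helper name `read_master_key` is hypothetical, and rejecting uppercase hex digits is an assumption based on the symbol set listed above.

```python
import string

HEX_SYMBOLS = set(string.digits + "abcdef")


def read_master_key(path):
    """Return the first 64 symbols of `path` if they form a valid hex key."""
    # open() follows symlinks, so a symlinked key location works as well.
    with open(path, "r", encoding="ascii", errors="replace") as f:
        key = f.read(64)  # symbols at offsets >= 64 are ignored
    if len(key) != 64 or not set(key) <= HEX_SYMBOLS:
        raise ValueError("master key must be 64 symbols from 0-9, a-f")
    return key
```

A trailing newline after the key is harmless, because everything past offset 64 is ignored.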
+
+8\. Mount the volume <vol_name> on the client side as usual. If you
+specified a location of the master key at step 6, then use the mount
+option
+
+    --xlator-option=<suffixed_vol_name>.master-key=<master_key_new_location>
+
+where <master_key_new_location> is the location of the master key
+specified at step 6, and <suffixed_vol_name> is <vol_name> suffixed
+with "-crypt". For example, if you created a volume "myvol" in step 1,
+then <suffixed_vol_name> is "myvol-crypt".
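The naming rule for the mount option can be written down as a one-line helper. The function name `master_key_mount_option` is hypothetical; the option syntax and the "-crypt" suffix are the ones given in step 8.

```python
def master_key_mount_option(vol_name, key_path):
    """Build the --xlator-option argument that overrides the master key location."""
    return "--xlator-option={}-crypt.master-key={}".format(vol_name, key_path)
```

For the volume "myvol" and a key stored at the illustrative path /home/me/mykey, this yields the option with the translator name "myvol-crypt".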
+
+9\. During mount your client machine receives configuration info from
+the untrusted server, so this step is extremely important! Check that
+your volume is really encrypted, and that it is encrypted with the
+proper master key (see FAQ \#1, \#2).
+
+10\. (optional) After a successful mount the file that contains the
+master key may be removed. NOTE: The next mount session will require
+the master key again. Keeping the master key between mount sessions is
+the user's responsibility (see section "Implications on
+manageability").
+
+8 How to test
+=============
+
+From a correctness standpoint, it's sufficient to run normal tests with
+encryption enabled. From a security standpoint, there's a whole
+discipline devoted to analysing the stored data for weaknesses, and
+engagement with practitioners of that discipline will be necessary to
+develop the right tests.
+
+9 Dependencies
+==============
+
+The crypt translator requires OpenSSL version \>= 1.0.1.
+
+10 Documentation
+================
+
+10.1 Basic design concepts
+--------------------------
+
+The basic design concepts are described in the following [pdf
+slides](http://www.gluster.org/community/documentation/index.php/File:GlusterFS_transparent_encryption.pdf)
+
+10.2 Procedure of security open
+-------------------------------
+
+In accordance with the basic design concepts above, before every
+access to a file's body (by read(2), write(2), truncate(2), etc.) we
+need to make sure that the file's metadata is trusted. Otherwise, we
+risk dealing with untrusted file data.
+
+To make sure that a file's metadata is trusted, the file is subjected
+to a special procedure of security open. This procedure is performed by
+the crypt translator at FOP-\>open() (crypt\_open) time by the function
+open\_format(). Currently this is a hardcoded composition of 2
+checks:
+
+1. verification of file's GFID by the file name;
+2. verification of file's metadata by the verified GFID;
+
+If the security open succeeds, the cache of the trusted client machine
+is replenished with the file descriptor and the file's inode, and the
+user can access the file's content through read(2), write(2),
+ftruncate(2), etc., system calls that accept a file descriptor as an
+argument.
+
+However, the file API also allows accessing the file body without
+opening the file. An example is truncate(2), which accepts a pathname
+instead of a file descriptor. To make sure that the file's metadata is
+trusted in this case, we create a temporary file descriptor and always
+call crypt\_open() before truncating the file's body.
+
+10.3 Encryption in different types of mount sessions
+----------------------------------------------------
+
+Everything described in the section above is valid only for FUSE
+mounts. GlusterFS also supports so-called NFS mounts. From the
+standpoint of security, the key difference between these types of mount
+sessions is that in NFS mount sessions file operations accept a
+so-called file handle (which is actually the GFID) instead of a file
+name. This creates problems, since the file name is the basic point for
+verification. As follows from the section above, using step 1 we can
+replenish the cache of the trusted machine with trusted file handles
+(GFIDs), and perform a security open only by a trusted GFID (by step
+2). However, in this case we need to make sure that there are no leaks
+of non-trusted GFIDs (and, moreover, that such leaks won't be
+introduced by the development process in the future). This is possible
+only with a changed GFID format: everywhere in GlusterFS the GFID
+should appear as a pair (uuid, is\_verified), where is\_verified is a
+boolean variable that is true if this GFID has passed the verification
+procedure (step 1 in the section above).
+
+The next problem is that the current NFS protocol doesn't encrypt the
+channel between NFS client and NFS server. It means that in NFS-mounts
+of GlusterFS volumes NFS client and GlusterFS client should be the same
+(trusted) machine.
+
+Taking into account the described problems, encryption in GlusterFS is
+not supported in NFS-mount sessions.
+
+10.4 Class of cipher algorithms for file data encryption that can be supported by the crypt translator
+------------------------------------------------------------------------------------------------------
+
+We'll assume that any symmetric block cipher algorithm is completely
+determined by a pair (alg\_id, mode\_id), where alg\_id is an algorithm
+defined on elementary cipher blocks (e.g. AES), and mode\_id is a mode
+of operation (e.g. ECB, XTS, etc).
+
+Technically, the crypt translator is able to support any symmetric block
+cipher algorithm via additional options of the crypt translator.
+However, in practice the set of supported algorithms is narrowed because
+of various security and organizational issues. Currently we support only
+one algorithm: AES\_XTS.
+
+10.5 Bibliography
+-----------------
+
+1. Recommendation for Block Cipher Modes of Operation: Methods and
+   Techniques (NIST Special Publication 800-38A).
+2. Recommendation for Block Cipher Modes of Operation: The XTS-AES Mode
+ for Confidentiality on Storage Devices (NIST Special Publication
+ 800-38E).
+3. Recommendation for Key Derivation Using Pseudorandom Functions
+   (NIST Special Publication 800-108).
+4. Recommendation for Block Cipher Modes of Operation: The CMAC Mode
+   for Authentication (NIST Special Publication 800-38B).
+5. Recommendation for Block Cipher Modes of Operation: Methods for Key
+   Wrapping (NIST Special Publication 800-38F).
+6. FIPS PUB 198-1 The Keyed-Hash Message Authentication Code (HMAC).
+7. David A. McGrew, John Viega "The Galois/Counter Mode of Operation
+ (GCM)".
+
+11 FAQ
+======
+
+**1. How can I make sure that my volume is really encrypted?**
+
+Check the respective graph of translators on your trusted client
+machine. This graph is created at mount time and is stored by default in
+the file /usr/local/var/log/glusterfs/mountpoint.log
+
+Here "mountpoint" is the absolute name of the mountpoint with the
+leading "/" dropped and the remaining "/" replaced with "-". For
+example, if your volume is mounted to /mnt/testfs, then you'll need to
+check the file /usr/local/var/log/glusterfs/mnt-testfs.log
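The naming rule for the log file can be sketched as follows. The helper name `client_log_path` is hypothetical, the default directory is the one quoted above, and dropping the leading (and any trailing) "/" is inferred from the /mnt/testfs example rather than stated explicitly.

```python
def client_log_path(mountpoint, log_dir="/usr/local/var/log/glusterfs"):
    """Map an absolute mountpoint to the client log file that holds the graph."""
    name = mountpoint.strip("/").replace("/", "-")
    return "{}/{}.log".format(log_dir, name)
```

The log directory is a parameter because installations built with a different prefix may keep their logs elsewhere.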
+
+Make sure that this graph contains the crypt translator, which looks
+like the following:
+
+    13: volume xvol-crypt
+    14:     type encryption/crypt
+    15:     option master-key /home/edward/mykey
+    16:     subvolumes xvol-dht
+    17: end-volume
+
+**2. How can I make sure that my volume is encrypted with the proper
+master key?**
+
+Check the graph of translators on your trusted client machine (see
+FAQ \#1). Make sure that the option "master-key" of the crypt translator
+specifies the correct location of the master key on your trusted client
+machine.
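The checks from FAQ \#1 and \#2 can be automated with a small parser over the graph text. This is a sketch under the assumption that the graph lines look like the sample in FAQ \#1 (an optional "NN:" prefix followed by the volume definition); the function name `crypt_master_key` is hypothetical, and key paths containing whitespace are not handled.

```python
import re


def crypt_master_key(graph_text):
    """Return the master-key path of the first encryption/crypt volume, or None."""
    in_crypt = False
    for line in graph_text.splitlines():
        # Drop an optional "NN:" prefix, as in the sample graph, then tokenize.
        words = re.sub(r"^\s*\d+:\s*", "", line).split()
        if words[:2] == ["type", "encryption/crypt"]:
            in_crypt = True
        elif words[:1] == ["end-volume"]:
            in_crypt = False
        elif in_crypt and words[:2] == ["option", "master-key"] and len(words) > 2:
            return words[2]
    return None
```

A result of `None` means that no crypt translator with a master-key option was found in the graph, in which case the volume should not be treated as encrypted.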
+
+**3. Can I change the encryption status of a volume?**
+
+You can change the encryption status (enable/disable encryption) only
+for empty volumes. Doing so on a non-empty volume is incorrect: you'll
+end up with I/O errors, data corruption and security problems. We
+strongly recommend deciding once and for all at volume creation time
+whether your volume is to be encrypted or not.
+
+**4. I am able to mount my encrypted volume with improper master keys
+and get a list of file names for every directory. Is this normal?**
+
+Yes, it is normal. It doesn't contradict the announced functionality: we
+encrypt only file's content. File names are not encrypted, so it doesn't
+make sense to hide them on the trusted client machine.
+
+**5. What is the reason for only supporting AES-XTS? This mode does not
+use Intel's AES-NI instructions and thus does not utilize the hardware
+feature.**
+
+Distributed file systems impose tighter requirements on at-rest
+encryption. We offer more than "at-rest encryption". We offer "at-rest
+encryption and authentication in distributed systems with non-trusted
+servers". Data and metadata on the server can easily be subjected to
+tampering and analysis with the purpose of revealing the user's secret
+data. We have to resist this tampering by performing data and metadata
+authentication.
+
+Unfortunately, it is technically hard to implement full-fledged data
+authentication via a stackable file system (GlusterFS translator), so we
+have decided to perform a "light" authentication by using a special
+cipher mode, which is resistant to tampering. Currently OpenSSL supports
+only one such mode: XTS. Tampering with ciphertext created in XTS mode
+leads to unpredictable changes in the plaintext. That is, the user will
+see "unpredictable gibberish" on the client side. Of course, this is not
+an "official way" to detect tampering, but it is much better than
+nothing. The "official way" (creating/checking MACs) is what we use for
+metadata authentication.
+
+Other modes supported by OpenSSL, like CBC, CFB, OFB, etc., are strongly
+not recommended for use in distributed systems with non-trusted servers.
+For example, CBC mode doesn't "survive" an overwrite of a logical block
+in a file. It means that with every such overwrite (a standard file
+system operation) we would need to re-encrypt the whole(!) file with a
+different key. CFB and OFB modes are sensitive to tampering: there is a
+way to perform \*predictable\* changes in plaintext, which is unacceptable.
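The kind of predictable change mentioned above can be illustrated with any scheme in which the ciphertext is the plaintext XORed with a keystream, which is how OFB works. The sketch below is a toy: the keystream is a SHA-256 based stand-in chosen only to keep the example self-contained, not a real OFB implementation, and all names and values are made up for illustration.

```python
import hashlib


def keystream(key, length):
    """Stand-in keystream built from SHA-256 and a counter; NOT real OFB."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key, data):
    """Encrypt or decrypt by XOR with the keystream (the operation is its own inverse)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


key = b"demo key, not a real master key"
ciphertext = bytearray(xor_cipher(key, b"pay 100"))
# The attacker knows the plaintext layout but not the key: turn "1" into "9".
ciphertext[4] ^= ord("1") ^ ord("9")
tampered = xor_cipher(key, bytes(ciphertext))  # decrypts to b"pay 900"
```

The attacker never learns the key, yet controls exactly which plaintext byte changes and what it changes to. Tampering with XTS ciphertext instead garbles the affected cipher block unpredictably.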
+
+Yes, XTS is slow (at least its current implementation in OpenSSL), but
+we don't promise that CFB or OFB with full-fledged authentication would
+be faster.