Feature
=======

Transparent encryption. Allows a volume to be encrypted "at rest" on the server using keys only available on the client.

1 Summary
=========

Distributed systems impose tighter requirements on at-rest encryption. This is because your encrypted data is stored on servers, which are de facto untrusted. In particular, your private encrypted data can be subjected to analysis and tampering, which will eventually reveal it if it is not properly protected. Specifically, it is usually not enough just to encrypt data. In distributed systems, serious protection of your personal data is possible only in conjunction with a special process called authentication. GlusterFS provides such an enhanced service: in GlusterFS, encryption is enhanced with authentication.

Currently we provide protection from "silent tampering". This is a kind of tampering that is hard to detect, because it doesn't break POSIX compliance. Specifically, we protect the encryption-specific metadata of a file. Such metadata includes the file's unique object id (GFID), cipher algorithm id, cipher block size and other attributes used by the encryption process.

1.1 Restrictions
----------------

1. We encrypt only file content. The feature of transparent encryption doesn't protect file names: they are neither encrypted nor verified. Protection of file names is not as critical as protection of the encryption-specific metadata: any attack based on tampering with file names will break POSIX compliance and result in massive corruption, which is easy to detect.

2. The feature of transparent encryption doesn't work in NFS mounts of GlusterFS volumes: NFS file handles introduce security issues which are hard to resolve. File operations on NFS mounts of encrypted GlusterFS volumes will fail (see section "Encryption in different types of mount sessions" for more details).

3. The feature of transparent encryption is incompatible with the GlusterFS performance translators quick-read, write-behind and open-behind.

2 Owners
========

Jeff Darcy

Edward Shishkin

3 Current status
================

Merged to the upstream.

4 Detailed Description
======================

See Summary.

5 Benefit to GlusterFS
======================

Besides the justifications that have applied to on-disk encryption just about forever, recent events have raised awareness significantly. Encryption using keys that are physically present at the server leaves data vulnerable to physical seizure of the server. Encryption using keys that are kept by the same organizational entity leaves data vulnerable to "insider threat" plus coercion or capture at the organization level. For many, especially various kinds of service providers, only pure client-side encryption provides the necessary levels of privacy and deniability.

Competitively, other projects - most notably [Tahoe-LAFS](https://leastauthority.com/) - are already using recently heightened awareness of these issues to attract users who would be better served by our performance/scalability, usability, and diversity of interfaces. Only the lack of proper encryption holds us back in these cases.

6 Scope
=======

6.1. Nature of proposed change
------------------------------

This is a new client-side translator, using user-provided key information plus information stored in xattrs to encrypt data transparently as it's written and decrypt it when it's read.

6.2. Implications on manageability
----------------------------------

The user needs to manage a per-volume master key (MK). That is:

1) Generate an independent MK for every volume which is to be encrypted. Note that one MK is created for the whole life of the volume.

2) Provide the MK on the client side at every mount, at the location that was specified at volume creation time or overridden via the respective mount option (see section "Getting Started With Crypt Translator").
3) Keep the MK between mount sessions. Note that after a successful mount the MK may be removed from the specified location. In this case the user should retain the MK safely until the next mount session.

The MK is a 256-bit secret string known only to the user. Generating and retaining the MK is the user's responsibility.

WARNING!!! Losing the MK will make the content of all regular files of your volume inaccessible. It is possible to mount a volume with an improper MK; however, such mount sessions allow access only to file names, as they are not encrypted.

Recommendations on MK generation

The MK has to be a high-entropy key, appropriately generated by a key derivation algorithm. One possible way is to use rand(1) provided by the OpenSSL package. You need to specify the option "-hex" for the proper output format. For example, the following command prints a generated key to the standard output:

    $ openssl rand -hex 32

6.3. Implications on presentation layer
---------------------------------------

N/A

6.4. Implications on persistence layer
--------------------------------------

N/A

6.5. Implications on 'GlusterFS' backend
----------------------------------------

All encrypted files on the servers contain padding at the end of the file. That is, the size of every encrypted file on the servers is a multiple of the cipher block size. The real file size is stored as the file's xattr with the key "trusted.glusterfs.crypt.att.size". The translation padded-file-size -\> real-file-size (and backward) is performed by the crypt translator.

6.6. Modification to GlusterFS metadata
---------------------------------------

Encryption-specific metadata in a specified format is stored as the file's xattr with the key "trusted.glusterfs.crypt.att.cfmt". The current format of the metadata string is described in slide \#27 of the following [design document](http://www.gluster.org/community/documentation/index.php/File:GlusterFS_transparent_encryption.pdf).

6.7. Options of the crypt translator
------------------------------------

- data-cipher-alg

  Specifies the cipher algorithm for file data encryption. Currently only one option is available: AES\_XTS. This is a hidden option.

- block-size

  Specifies the size (in bytes) of the logical chunk which is encrypted as a whole unit in the file body. If cipher modes with initial vectors are used for encryption, then the initial vector is reset for every such chunk. Available values are: "512", "1024", "2048" and "4096". Default value is "4096".

- data-key-size

  Specifies the size (in bits) of the data cipher key. For AES\_XTS available values are: "256" and "512". Default value is "256". The larger key size ("512") is for stronger security.

- master-key

  Specifies the pathname of a regular file or symlink. Defines the location of the master volume key on the trusted client machine.

7 Getting Started With Crypt Translator
=======================================

1. Create a volume `<vol_name>`.

2. Turn on crypt xlator:

       # gluster volume set <vol_name> encryption on

3. Turn off the performance xlators that encryption is currently incompatible with:

       # gluster volume set <vol_name> performance.quick-read off
       # gluster volume set <vol_name> performance.write-behind off
       # gluster volume set <vol_name> performance.open-behind off

4. (optional) Set the location of the volume master key:

       # gluster volume set <vol_name> encryption.master-key <master_key_location>

   where `<master_key_location>` is an absolute pathname of the file which will contain the volume master key (see section "Implications on manageability").

5. (optional) Override the default options of crypt xlator:

       # gluster volume set <vol_name> encryption.data-key-size <data_key_size>

   where `<data_key_size>` should have one of the following values: "256" (default), "512".

       # gluster volume set <vol_name> encryption.block-size <block_size>

   where `<block_size>` should have one of the following values: "512", "1024", "2048", "4096" (default).

6. Define the location of the master key on your client machine (`<master_key_new_location>`), if it wasn't specified at step 4 above, or if you want it to be different from the `<master_key_location>` specified at step 4.

7. On the client side make sure that the file with the name `<master_key_location>` (or `<master_key_new_location>` defined at step 6) exists and contains the respective per-volume master key (see section "Implications on manageability"). This key has to be in hex form, i.e. it should be represented by 64 symbols from the set {'0', ..., '9', 'a', ..., 'f'}. The key should start at the beginning of the file. All symbols at offsets \>= 64 are ignored.

   NOTE: `<master_key_location>` (or `<master_key_new_location>` defined at step 6) can be a symlink. In this case make sure that the target file of this symlink exists and contains the respective per-volume master key.

8. Mount the volume on the client side as usual. If you specified a location of the master key at step 6, then use the mount option

       --xlator-option=<suffixed_vol_name>.master-key=<master_key_new_location>

   where `<master_key_new_location>` is the location of the master key specified at step 6, and `<suffixed_vol_name>` is `<vol_name>` suffixed with "-crypt". For example, if you created a volume "myvol" in step 1, then suffixed\_vol\_name is "myvol-crypt".

9. During mount your client machine receives configuration info from the untrusted server, so this step is extremely important! Check that your volume is really encrypted, and that it is encrypted with the proper master key (see FAQ \#1, \#2).

10. (optional) After a successful mount the file which contains the master key may be removed. NOTE: The next mount session will require the master key again. Keeping the master key between mount sessions is the user's responsibility (see section "Implications on manageability").

8 How to test
=============

From a correctness standpoint, it's sufficient to run normal tests with encryption enabled. From a security standpoint, there's a whole discipline devoted to analysing the stored data for weaknesses, and engagement with practitioners of that discipline will be necessary to develop the right tests.
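
The master key file format described in step 7 of "Getting Started With Crypt Translator" can be checked on the client before mounting. The following is a minimal Python sketch, not part of GlusterFS: the helper names `generate_master_key` and `read_master_key` are hypothetical, and the check only mirrors the rules stated in this document (64 symbols from '0'-'9', 'a'-'f' at the start of the file, everything at offsets >= 64 ignored). Whether the translator itself accepts uppercase hex digits is not stated here, so the sketch conservatively rejects them.

```python
import re
import secrets
import tempfile
from pathlib import Path

KEY_HEX_LEN = 64  # 256-bit master key written as 64 hex symbols


def generate_master_key() -> str:
    """Return 64 lowercase hex symbols from the OS CSPRNG.

    Intended as an equivalent of `openssl rand -hex 32`.
    """
    return secrets.token_hex(32)


def read_master_key(path) -> bytes:
    """Hypothetical pre-mount check of a master key file.

    Reads the first 64 bytes (symlinks are followed), requires them to be
    lowercase hex, ignores everything after offset 64, and returns the
    32 raw key bytes.
    """
    head = Path(path).read_bytes()[:KEY_HEX_LEN]
    if not re.fullmatch(rb"[0-9a-f]{64}", head):
        raise ValueError(f"{path}: expected 64 lowercase hex symbols at offset 0")
    return bytes.fromhex(head.decode("ascii"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        key_file = Path(tmp) / "mykey"
        key_file.write_text(generate_master_key() + "\n")
        print(len(read_master_key(key_file)), "key bytes read")
```

A check like this catches a truncated or wrongly formatted key file early. It cannot tell whether the key is the right one for the volume.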

9 Dependencies
==============

Crypt translator requires OpenSSL of version \>= 1.0.1

10 Documentation
================

10.1 Basic design concepts
--------------------------

The basic design concepts are described in the following [pdf slides](http://www.gluster.org/community/documentation/index.php/File:GlusterFS_transparent_encryption.pdf)

10.2 Procedure of security open
-------------------------------

In accordance with the basic design concepts above, before every access to a file's body (by read(2), write(2), truncate(2), etc.) we need to make sure that the file's metadata is trusted. Otherwise, we risk dealing with untrusted file data.

To make sure that the file's metadata is trusted, the file is subjected to a special procedure of security open. The procedure of security open is performed by the crypt translator at FOP-\>open() (crypt\_open) time by the function open\_format(). Currently this is a hardcoded composition of 2 checks:

1. verification of the file's GFID by the file name;
2. verification of the file's metadata by the verified GFID.

If the security open succeeds, then the cache of the trusted client machine is replenished with the file descriptor and the file's inode, and the user can access the file's content with the read(2), write(2), ftruncate(2), etc. system calls, which accept a file descriptor as an argument.

However, the file API also allows access to a file's body without opening the file. For example, truncate(2) accepts a pathname instead of a file descriptor. To make sure that the file's metadata is trusted in this case, we create a temporary file descriptor and always call crypt\_open() before truncating the file's body.

10.3 Encryption in different types of mount sessions
----------------------------------------------------

Everything described in the section above is valid only for FUSE mounts. Besides, GlusterFS also supports so-called NFS mounts.
From the standpoint of security, the key difference between these types of mount sessions is that in NFS mount sessions file operations accept a so-called file handle (which is actually the GFID) instead of a file name. This creates problems, since the file name is the basic point for verification. As follows from the section above, using step 1 we can replenish the cache of the trusted machine with trusted file handles (GFIDs), and perform a security open only by trusted GFID (step 2). However, in this case we need to make sure that there are no leaks of non-trusted GFIDs (and, moreover, that such leaks won't be introduced by the development process in the future). This is possible only with a changed GFID format: everywhere in GlusterFS a GFID should appear as a pair (uuid, is\_verified), where is\_verified is a boolean variable which is true if this GFID has passed the verification procedure (step 1 in the section above).

The next problem is that the current NFS protocol doesn't encrypt the channel between NFS client and NFS server. It means that in NFS mounts of GlusterFS volumes the NFS client and the GlusterFS client should be the same (trusted) machine.

Taking into account the described problems, encryption in GlusterFS is not supported in NFS mount sessions.

10.4 Class of cipher algorithms for file data encryption that can be supported by the crypt translator
------------------------------------------------------------------------------------------------------

We'll assume that any symmetric block cipher algorithm is completely determined by a pair (alg\_id, mode\_id), where alg\_id is an algorithm defined on elementary cipher blocks (e.g. AES), and mode\_id is a mode of operation (e.g. ECB, XTS, etc.).

Technically, the crypt translator is able to support any symmetric block cipher algorithm via additional options of the crypt translator. However, in practice the set of supported algorithms is narrowed because of various security and organizational issues.
Currently we support only one algorithm: AES\_XTS.

10.5 Bibliography
-----------------

1. Recommendation for Block Cipher Modes of Operation (NIST Special Publication 800-38A).
2. Recommendation for Block Cipher Modes of Operation: The XTS-AES Mode for Confidentiality on Storage Devices (NIST Special Publication 800-38E).
3. Recommendation for Key Derivation Using Pseudorandom Functions (NIST Special Publication 800-108).
4. Recommendation for Block Cipher Modes of Operation: The CMAC Mode for Authentication (NIST Special Publication 800-38B).
5. Recommendation for Block Cipher Modes of Operation: Methods for Key Wrapping (NIST Special Publication 800-38F).
6. FIPS PUB 198-1, The Keyed-Hash Message Authentication Code (HMAC).
7. David A. McGrew, John Viega, "The Galois/Counter Mode of Operation (GCM)".

11 FAQ
======

**1. How to make sure that my volume is really encrypted?**

Check the respective graph of translators on your trusted client machine. This graph is created at mount time and is stored by default in the file

    /usr/local/var/log/glusterfs/mountpoint.log

Here "mountpoint" is the absolute name of the mountpoint, where "/" are replaced with "-". For example, if your volume is mounted to /mnt/testfs, then you'll need to check the file

    /usr/local/var/log/glusterfs/mnt-testfs.log

Make sure that this graph contains the crypt translator, which looks like the following:

    13: volume xvol-crypt
    14:     type encryption/crypt
    15:     option master-key /home/edward/mykey
    16:     subvolumes xvol-dht
    17: end-volume

**2. How to make sure that my volume is encrypted with a proper master key?**

Check the graph of translators on your trusted client machine (see FAQ \#1). Make sure that the option "master-key" of the crypt translator specifies the correct location of the master key on your trusted client machine.

**3. Can I change the encryption status of a volume?**

You can change the encryption status (enable/disable encryption) only for empty volumes.
Otherwise the result will be incorrect (you'll end up with IO errors, data corruption and security problems). We strongly recommend deciding once and for all, at volume creation time, whether your volume has to be encrypted or not.

**4. I am able to mount my encrypted volume with improper master keys and get list of file names for every directory. Is it normal?**

Yes, it is normal. It doesn't contradict the announced functionality: we encrypt only file content. File names are not encrypted, so it doesn't make sense to hide them on the trusted client machine.

**5. What is the reason for only supporting AES-XTS? This mode is not using Intel's AES-NI instruction thus not utilizing hardware feature..**

Distributed file systems impose tighter requirements on at-rest encryption. We offer more than "at-rest encryption". We offer "at-rest encryption and authentication in distributed systems with non-trusted servers".

Data and metadata on the server can easily be subjected to tampering and analysis with the purpose of revealing the user's secret data. We have to resist this tampering by performing data and metadata authentication.

Unfortunately, it is technically hard to implement full-fledged data authentication via a stackable file system (GlusterFS translator), so we have decided to perform a "light" authentication by using a special cipher mode which is resistant to tampering. Currently OpenSSL supports only one such mode: XTS. Tampering with ciphertext created in XTS mode leads to unpredictable changes in the plaintext. That is, the user will see "unpredictable gibberish" on the client side. Of course, this is not an "official way" to detect tampering, but it is much better than nothing. The "official way" (creating/checking MACs) is what we use for metadata authentication.

Other modes supported by OpenSSL, like CBC, CFB, OFB, etc., are strongly not recommended for use in distributed systems with non-trusted servers.
For example, CBC mode doesn't "survive" an overwrite of a logical block in a file. It means that with every such overwrite (a standard file system operation) we would need to re-encrypt the whole(!) file with a different key. CFB and OFB modes are sensitive to tampering: there is a way to make \*predictable\* changes in the plaintext, which is unacceptable. Yes, XTS is slow (at least its current implementation in OpenSSL), but we don't promise that CFB or OFB with full-fledged authentication will be faster.
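
The "predictable changes" mentioned for OFB can be illustrated with a small Python sketch. It is an illustration under stated assumptions, not GlusterFS or OpenSSL code: the keystream below is a toy built from SHA-256 and merely stands in for an AES-OFB keystream, and the names `keystream`, `xor_crypt` and `tamper` are hypothetical. The property shown holds for any mode that XORs a keystream into the plaintext: an attacker on the server who knows or guesses part of the plaintext can rewrite it without the key.

```python
import hashlib


def keystream(key: bytes, iv: bytes, n: int) -> bytes:
    """Toy keystream (SHA-256 over a counter); a stand-in, NOT AES-OFB."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])


def xor_crypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing data with the keystream."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, iv, len(data))))


def tamper(ciphertext: bytes, offset: int, known: bytes, wanted: bytes) -> bytes:
    """Attacker side: no key needed, only a guess of the plaintext at offset."""
    assert len(known) == len(wanted)
    patched = bytearray(ciphertext)
    for i, (k, w) in enumerate(zip(known, wanted)):
        patched[offset + i] ^= k ^ w  # flips exactly the targeted plaintext bits
    return bytes(patched)


key, iv = b"k" * 32, b"i" * 16
ciphertext = xor_crypt(key, iv, b"pay 0100 to bob")
forged = tamper(ciphertext, 4, b"0100", b"9999")
print(xor_crypt(key, iv, forged))  # b'pay 9999 to bob'
```

With XTS the same bit flips would instead turn the affected 16-byte cipher block into unpredictable data after decryption, which is the "light" authentication effect described in the answer above.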