Feature
=======

Transparent encryption. Allows a volume to be encrypted "at rest" on the
server using keys only available on the client.

1 Summary
=========

Distributed systems impose tighter requirements on at-rest encryption,
because the encrypted data is stored on servers, which are de facto
untrusted. In particular, private encrypted data can be subjected to
analysis and tampering, which will eventually reveal it if it is not
properly protected. Encrypting the data is usually not enough: in
distributed systems, serious protection of personal data is possible
only in conjunction with a special process called authentication.
GlusterFS provides such an enhanced service: encryption is combined with
authentication. Currently we provide protection from "silent tampering",
a kind of tampering that is hard to detect because it doesn't break
POSIX compliance. Specifically, we protect a file's encryption-specific
metadata. This metadata includes the file's unique object id (GFID),
cipher algorithm id, cipher block size and other attributes used by the
encryption process.

1.1 Restrictions
----------------

​1. We encrypt only file content. The feature of transparent encryption
doesn't protect file names: they are neither encrypted nor verified.
Protection of file names is not as critical as protection of a file's
encryption-specific metadata: any attack based on tampering with file
names will break POSIX compliance and result in massive corruption,
which is easy to detect.

​2. The feature of transparent encryption doesn't work in NFS-mounts of
GlusterFS volumes: NFS's file handles introduce security issues, which
are hard to resolve. NFS mounts of encrypted GlusterFS volumes will
result in failed file operations (see section "Encryption in different
types of mount sessions" for more details).

​3. The feature of transparent encryption is incompatible with GlusterFS
performance translators quick-read, write-behind and open-behind.

2 Owners
========

Jeff Darcy <jdarcy@redhat.com>  
Edward Shishkin <eshishki@redhat.com>

3 Current status
================

Merged to the upstream.

4 Detailed Description
======================

See Summary.

5 Benefit to GlusterFS
======================

Besides the justifications that have applied to on-disk encryption just
about forever, recent events have raised awareness significantly.
Encryption using keys that are physically present at the server leaves
data vulnerable to physical seizure of the server. Encryption using keys
that are kept by the same organizational entity leaves data vulnerable to
"insider threat" plus coercion or capture at the organization level. For
many, especially various kinds of service providers, only pure
client-side encryption provides the necessary levels of privacy and
deniability.

Competitively, other projects - most notably
[Tahoe-LAFS](https://leastauthority.com/) - are already using recently
heightened awareness of these issues to attract users who would be
better served by our performance/scalability, usability, and diversity
of interfaces. Only the lack of proper encryption holds us back in these
cases.

6 Scope
=======

6.1. Nature of proposed change
------------------------------

This is a new client-side translator, using user-provided key
information plus information stored in xattrs to encrypt data
transparently as it's written and decrypt when it's read.

6.2. Implications on manageability
----------------------------------

The user needs to manage a per-volume master key (MK). That is:

​1) Generate an independent MK for every volume which is to be
encrypted. Note that one MK is created for the whole life of the
volume.

​2) Provide MK on the client side at every mount in accordance with the
location that was specified at volume creation time or overridden via
the respective mount option (see section "Getting Started With Crypt
Translator").

​3) Keep MK between mount sessions. Note that after successful mount MK
may be removed from the specified location. In this case the user should
retain the MK safely until the next mount session.

MK is a 256-bit secret string, which is known only to the user.
Generating and retaining the MK is the user's responsibility.

WARNING!!! Losing the MK will make the content of all regular files of
your volume inaccessible. It is possible to mount a volume with an
improper MK; however, such mount sessions will allow access only to file
names, as they are not encrypted.

Recommendations on MK generation

MK has to be a high-entropy key, appropriately generated, for example by
a key derivation algorithm or a cryptographically secure random number
generator. One possible way is to use rand(1) provided by the OpenSSL
package. You need to specify the option "-hex" for the proper output
format. For example, the following command prints a generated key to
the standard output:

		$ openssl rand -hex 32
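
For users who prefer to script key generation, the same kind of key can
be produced with Python's standard library. This is only a sketch: the
function names are hypothetical, and creating the file with owner-only
permissions is an assumption about good practice, not something the
crypt translator requires.

```python
import os
import secrets


def generate_master_key_hex() -> str:
    """Return a fresh 256-bit key as 64 lowercase hex characters."""
    return secrets.token_hex(32)


def write_master_key(path: str) -> None:
    """Write a new key to `path`, readable by the owner only.

    O_EXCL makes the call fail instead of overwriting an existing key,
    since losing the old key makes the volume's file content inaccessible.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as key_file:
        key_file.write(generate_master_key_hex())
```

The output of `generate_master_key_hex()` has the same format as the
output of the openssl command above.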

6.3. Implications on presentation layer
---------------------------------------

N/A

6.4. Implications on persistence layer
--------------------------------------

N/A

6.5. Implications on 'GlusterFS' backend
----------------------------------------

All encrypted files on the servers contain padding at the end of the
file. That is, the size of every encrypted file on the servers is a
multiple of the cipher block size. The real file size is stored as the
file's xattr with the key "trusted.glusterfs.crypt.att.size". The
translation padded-file-size -\> real-file-size (and back) is performed
by the crypt translator.
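
The relation between the two sizes can be illustrated with a few lines
of arithmetic. This is a sketch, not the translator's code: the function
name is hypothetical, the value 16 used in the examples assumes that
"cipher block size" means the 16-byte AES block, and it is also an
assumption that a file whose size is already aligned gets no extra
padding.

```python
def padded_size(real_size: int, cipher_block_size: int) -> int:
    """Round real_size up to the next multiple of cipher_block_size."""
    if real_size < 0 or cipher_block_size <= 0:
        raise ValueError("sizes must be non-negative / positive")
    blocks = -(-real_size // cipher_block_size)  # ceiling division
    return blocks * cipher_block_size


# A 1000-byte file would occupy 1008 bytes on the server (63 blocks of 16),
# while the xattr "trusted.glusterfs.crypt.att.size" would still hold 1000.
print(padded_size(1000, 16))  # 1008
print(padded_size(1024, 16))  # 1024 (already aligned)
```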

6.6. Modification to GlusterFS metadata
---------------------------------------

Encryption-specific metadata in a specified format is stored as the
file's xattr with the key "trusted.glusterfs.crypt.att.cfmt". The
current format of the metadata string is described in slide \#27 of the
following [design
document](http://www.gluster.org/community/documentation/index.php/File:GlusterFS_transparent_encryption.pdf)

6.7. Options of the crypt translator
------------------------------------

-   data-cipher-alg

Specifies the cipher algorithm for file data encryption. Currently only
one option is available: AES\_XTS. This is a hidden option.

-   block-size

Specifies the size (in bytes) of the logical chunk which is encrypted as
a whole unit in the file body. If cipher modes with initialization
vectors are used for encryption, then the initialization vector gets
reset for every such chunk. Available values are: "512", "1024", "2048"
and "4096". Default value is "4096".

-   data-key-size

Specifies size (in bits) of data cipher key. For AES\_XTS available
values are: "256" and "512". Default value is "256". The larger key size
("512") is for stronger security.

-   master-key

Specifies the pathname of a regular file or symlink. Defines the
location of the master volume key on the trusted client machine.
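
The allowed values listed above can be checked before they are passed to
"gluster volume set". The following is a minimal sketch: the function
and table names are hypothetical, and the rule that master-key must be
an absolute pathname is taken from the "Getting Started" section below.

```python
# Allowed values as listed in this section.
ALLOWED_VALUES = {
    "data-cipher-alg": {"AES_XTS"},
    "block-size": {"512", "1024", "2048", "4096"},
    "data-key-size": {"256", "512"},
}


def check_crypt_options(options: dict) -> list:
    """Return a list of human-readable problems; empty if all is fine."""
    problems = []
    for name, value in options.items():
        if name == "master-key":
            if not value.startswith("/"):
                problems.append("master-key must be an absolute pathname")
        elif name not in ALLOWED_VALUES:
            problems.append(f"unknown option: {name}")
        elif value not in ALLOWED_VALUES[name]:
            allowed = ", ".join(sorted(ALLOWED_VALUES[name], key=len))
            problems.append(f"{name}: {value!r} is not one of {allowed}")
    return problems


print(check_crypt_options({"block-size": "8192"}))
```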

7 Getting Started With Crypt Translator
=======================================

​1. Create a volume <vol_name>.

​2. Turn on crypt xlator:

		# gluster volume set <vol_name> encryption on

​3. Turn off the performance xlators that encryption is currently
incompatible with:

		# gluster volume set <vol_name> performance.quick-read off
		# gluster volume set <vol_name> performance.write-behind off
		# gluster volume set <vol_name> performance.open-behind off

​4. (optional) Set location of the volume master key:

		# gluster volume set <vol_name> encryption.master-key <master_key_location>

where <master_key_location> is the absolute pathname of the file that
will contain the volume master key (see section "Implications on
manageability").

​5. (optional) Override default options of crypt xlator:

		# gluster volume set <vol_name> encryption.data-key-size <data_key_size>

where <data_key_size> should have one of the following values:
"256"(default), "512".

		# gluster volume set <vol_name> encryption.block-size <block_size>

where <block_size> should have one of the following values: "512",
"1024", "2048", "4096"(default).

​6. Define location of the master key on your client machine, if it
wasn't specified at step 4 above, or you want it to be different from
the <master_key_location> specified at step 4. Below, this location is
referred to as <master_key_new_location>.

​7. On the client side make sure that the file with name
<master_key_location> (or <master_key_new_location> defined at
step 6) exists and contains the respective per-volume master key (see
section "Implications on manageability"). This key has to be in hex
form, i.e. it should be represented by 64 symbols from the set {'0',
..., '9', 'a', ..., 'f'}. The key should start at the beginning of the
file. All symbols at offsets \>= 64 are ignored.

NOTE: <master_key_location> (or <master_key_new_location> defined at
step 6) can be a symlink. In this case make sure that the target file of
this symlink exists and contains respective per-volume master key.

​8. Mount the volume <vol_name> on the client side as usual. If you
specified a location of the master key at step 6, then use the mount
option

--xlator-option=<suffixed_vol_name>.master-key=<master_key_new_location>

where <master_key_new_location> is the location of the master key
specified at step 6, and <suffixed_vol_name> is <vol_name> suffixed with
"-crypt". For example, if you created a volume "myvol" in step 1, then
suffixed\_vol\_name is "myvol-crypt".

​9. During mount your client machine receives configuration info from
the untrusted server, so this step is extremely important! Check that
your volume is really encrypted, and that it is encrypted with the
proper master key (see FAQ \#1,\#2).

​10. (optional) After successful mount the file which contains master
key may be removed. NOTE: The next mount session will require the master
key again. Keeping the master key between mount sessions is the user's
responsibility (see section "Implications on manageability").
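
The key-file format from step 7 and the mount option from step 8 can be
checked with a small script before mounting. This is a sketch with
hypothetical function names. It follows the rules stated above literally
(64 symbols from '0'-'9' and 'a'-'f' at the start of the file, the rest
ignored); whether the translator itself also accepts uppercase hex
digits is not stated here, so the sketch rejects them.

```python
HEX_SYMBOLS = set("0123456789abcdef")


def parse_master_key(text: str) -> bytes:
    """Decode the first 64 hex symbols; anything after them is ignored."""
    head = text[:64]
    if len(head) < 64 or any(c not in HEX_SYMBOLS for c in head):
        raise ValueError("key file must start with 64 lowercase hex symbols")
    return bytes.fromhex(head)  # 32 bytes = 256 bits


def load_master_key(path: str) -> bytes:
    """Read the key from `path`; open() follows symlinks, as in the NOTE."""
    with open(path, "r", encoding="ascii", errors="replace") as key_file:
        return parse_master_key(key_file.read(64))


def master_key_mount_option(vol_name: str, key_location: str) -> str:
    """Build the --xlator-option string described in step 8."""
    suffixed_vol_name = vol_name + "-crypt"
    return f"--xlator-option={suffixed_vol_name}.master-key={key_location}"


print(master_key_mount_option("myvol", "/home/edward/mykey"))
# --xlator-option=myvol-crypt.master-key=/home/edward/mykey
```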

8 How to test
=============

From a correctness standpoint, it's sufficient to run normal tests with
encryption enabled. From a security standpoint, there's a whole
discipline devoted to analysing the stored data for weaknesses, and
engagement with practitioners of that discipline will be necessary to
develop the right tests.

9 Dependencies
==============

The crypt translator requires OpenSSL version \>= 1.0.1.

10 Documentation
================

10.1 Basic design concepts
--------------------------

The basic design concepts are described in the following [pdf
slides](http://www.gluster.org/community/documentation/index.php/File:GlusterFS_transparent_encryption.pdf)

10.2 Procedure of security open
-------------------------------

In accordance with the basic design concepts above, before every access
to a file's body (by read(2), write(2), truncate(2), etc.) we need to
make sure that the file's metadata is trusted. Otherwise, we risk
dealing with untrusted file data.

To make sure that a file's metadata is trusted, the file is subjected to
a special procedure of security open. The procedure of security open is
performed by the crypt translator at FOP-\>open() (crypt\_open) time by
the function open\_format(). Currently this is a hardcoded composition
of 2 checks:

1.  verification of file's GFID by the file name;
2.  verification of file's metadata by the verified GFID;

If the security open succeeds, then the cache of the trusted client
machine is replenished with the file descriptor and the file's inode,
and the user can access the file's content by read(2), write(2),
ftruncate(2), etc. system calls, which accept a file descriptor as
argument.

However, the file API also allows access to a file's body without
opening the file, for example via truncate(2), which accepts a pathname
instead of a file descriptor. To make sure that the file's metadata is
trusted in this case, we create a temporary file descriptor and always
call crypt\_open() before truncating the file's body.
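
The order of the two checks can be modelled in a few lines. This is a
schematic sketch only: it is not the code of open\_format(), all names
are hypothetical, and the two verification callables stand in for checks
whose actual mechanics are described in the design slides rather than
here.

```python
class UntrustedFileError(Exception):
    """Raised when a file does not pass the security open."""


def security_open(name, gfid, metadata, verify_gfid, verify_metadata,
                  trusted_cache):
    """Admit a file to the trusted cache only after both checks pass."""
    # Check 1: the GFID reported by the server must match the file name.
    if not verify_gfid(name, gfid):
        raise UntrustedFileError(f"GFID of {name!r} failed verification")
    # Check 2: the metadata is verified against the already verified GFID.
    if not verify_metadata(gfid, metadata):
        raise UntrustedFileError(f"metadata of {name!r} failed verification")
    # Only now does the trusted client cache learn about the file.
    trusted_cache[name] = (gfid, metadata)
    return trusted_cache[name]
```

A path-based operation such as truncate(2) would call the same function
first and proceed only if no exception was raised.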

10.3 Encryption in different types of mount sessions
----------------------------------------------------

Everything described in the section above is valid only for FUSE mounts.
GlusterFS also supports so-called NFS mounts. From the standpoint of
security, the key difference between these types of mount sessions is
that in NFS mount sessions file operations accept a so-called file
handle (which is actually the GFID) instead of a file name. This creates
problems, since the file name is the basic point for verification. As
follows from the section above, using step 1 we can replenish the cache
of the trusted machine with trusted file handles (GFIDs) and perform a
security open only by a trusted GFID (by step 2). However, in this case
we need to make sure that there are no leaks of non-trusted GFIDs (and,
moreover, that such leaks won't be introduced by the development process
in the future). This is possible only with a changed GFID format:
everywhere in GlusterFS a GFID should appear as a pair (uuid,
is\_verified), where is\_verified is a boolean variable, which is true
if this GFID has passed the verification procedure (step 1 in the
section above).
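
The proposed pair can be pictured as a small immutable type. The sketch
below is purely illustrative: GlusterFS does not define such a type, and
all names in it are hypothetical.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedGfid:
    """A GFID together with the flag proposed in the text above."""
    value: uuid.UUID
    is_verified: bool = False  # handles coming from outside start untrusted


def mark_verified(gfid: TaggedGfid) -> TaggedGfid:
    """To be called only by the GFID verification step (step 1)."""
    return TaggedGfid(gfid.value, True)


def open_by_handle(gfid: TaggedGfid) -> uuid.UUID:
    """Refuse any handle that has not passed verification."""
    if not gfid.is_verified:
        raise PermissionError("refusing to open by a non-verified GFID")
    return gfid.value
```

Because the flag travels with the identifier, a code path that receives
a handle from an untrusted source cannot pass it on as a trusted one by
accident.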

The next problem is that the current NFS protocol doesn't encrypt the
channel between NFS client and NFS server. It means that in NFS-mounts
of GlusterFS volumes NFS client and GlusterFS client should be the same
(trusted) machine.

Taking into account the described problems, encryption in GlusterFS is
not supported in NFS-mount sessions.

10.4 Class of cipher algorithms for file data encryption that can be supported by the crypt translator
------------------------------------------------------------------------------------------------------

We'll assume that any symmetric block cipher algorithm is completely
determined by a pair (alg\_id, mode\_id), where alg\_id is an algorithm
defined on elementary cipher blocks (e.g. AES), and mode\_id is a mode
of operation (e.g. ECB, XTS, etc).

Technically, the crypt translator is able to support any symmetric block
cipher algorithm via additional options of the crypt translator.
However, in practice the set of supported algorithms is narrowed because
of various security and organization issues. Currently we support only
one algorithm: AES\_XTS.

10.5 Bibliography
-----------------

1.  Recommendation for Block Cipher Modes of Operation: Methods and
    Techniques (NIST Special Publication 800-38A).
2.  Recommendation for Block Cipher Modes of Operation: The XTS-AES Mode
    for Confidentiality on Storage Devices (NIST Special Publication
    800-38E).
3.  Recommendation for Key Derivation Using Pseudorandom Functions (NIST
    Special Publication 800-108).
4.  Recommendation for Block Cipher Modes of Operation: The CMAC Mode
    for Authentication (NIST Special Publication 800-38B).
5.  Recommendation for Block Cipher Modes of Operation: Methods for Key
    Wrapping (NIST Special Publication 800-38F).
6.  FIPS PUB 198-1, The Keyed-Hash Message Authentication Code (HMAC).
7.  David A. McGrew, John Viega, "The Galois/Counter Mode of Operation
    (GCM)".

11 FAQ
======

**1. How to make sure that my volume is really encrypted?**

Check the respective graph of translators on your trusted client
machine. This graph is created at mount time and is stored by default in
the file /usr/local/var/log/glusterfs/mountpoint.log

Here "mountpoint" is the absolute name of the mountpoint, with the
leading "/" dropped and the remaining "/" replaced with "-". For
example, if your volume is mounted to /mnt/testfs, then you'll need to
check the file /usr/local/var/log/glusterfs/mnt-testfs.log
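
The naming rule can be written down as a one-line helper. This is a
sketch with a hypothetical function name; the default directory is the
one quoted above and will differ on installations with another prefix,
and ignoring a trailing "/" is an assumption.

```python
def client_log_path(mountpoint: str,
                    log_dir: str = "/usr/local/var/log/glusterfs") -> str:
    """Derive the client log file name from an absolute mountpoint."""
    name = mountpoint.strip("/").replace("/", "-")
    return f"{log_dir}/{name}.log"


print(client_log_path("/mnt/testfs"))
# /usr/local/var/log/glusterfs/mnt-testfs.log
```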

Make sure that this graph contains the crypt translator, which looks
like the following:

		13: volume xvol-crypt
		14:     type encryption/crypt
		15:     option master-key /home/edward/mykey
		16:     subvolumes xvol-dht
		17: end-volume

**2. How to make sure that my volume is encrypted with a proper master
key?**

Check the graph of translators on your trusted client machine (see the
FAQ\#1). Make sure that the option "master-key" of the crypt translator
specifies correct location of the master key on your trusted client
machine.
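
Both checks (FAQ \#1 and \#2) can be automated by scanning the log for a
crypt translator stanza. The following is a minimal sketch with a
hypothetical function name. It assumes the graph is printed in the
format shown in FAQ \#1 and tolerates an arbitrary prefix, such as the
line number, before each keyword.

```python
import re


def crypt_master_keys(graph_text: str) -> dict:
    """Map each encryption/crypt volume to its master-key option (or None)."""
    found = {}
    name = xl_type = key = None
    for line in graph_text.splitlines():
        if re.search(r"\bend-volume\s*$", line):
            if xl_type == "encryption/crypt":
                found[name] = key
            name = xl_type = key = None
            continue
        match = re.search(r"\bvolume\s+(\S+)\s*$", line)
        if match:
            name = match.group(1)
            continue
        match = re.search(r"\btype\s+(\S+)\s*$", line)
        if match:
            xl_type = match.group(1)
            continue
        match = re.search(r"\boption\s+master-key\s+(\S+)\s*$", line)
        if match:
            key = match.group(1)
    return found


sample = """\
13: volume xvol-crypt
14:     type encryption/crypt
15:     option master-key /home/edward/mykey
16:     subvolumes xvol-dht
17: end-volume
"""
print(crypt_master_keys(sample))  # {'xvol-crypt': '/home/edward/mykey'}
```

An empty result means that the graph contains no crypt translator, i.e.
the volume is not encrypted on this client.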

**3. Can I change the encryption status of a volume?**

You can change the encryption status (enable/disable encryption) only
for empty volumes. Otherwise the result will be incorrect (you'll end up
with IO errors, data corruption and security problems). We strongly
recommend deciding once and for all, at volume creation time, whether
your volume has to be encrypted or not.

**4. I am able to mount my encrypted volume with improper master keys
and get list of file names for every directory. Is it normal?**

Yes, it is normal. It doesn't contradict the announced functionality: we
encrypt only file's content. File names are not encrypted, so it doesn't
make sense to hide them on the trusted client machine.

**5. What is the reason for only supporting AES-XTS? This mode does not
use Intel's AES-NI instructions and thus does not utilize the hardware
feature.**

Distributed file systems impose tighter requirements on at-rest
encryption. We offer more than "at-rest encryption". We offer "at-rest
encryption and authentication in distributed systems with non-trusted
servers". Data and metadata on the server can easily be subjected to
tampering and analysis with the purpose of revealing the user's secret
data. We have to resist this tampering by performing data and metadata
authentication.

Unfortunately, it is technically hard to implement full-fledged data
authentication via a stackable file system (GlusterFS translator), so we
have decided to perform a "light" authentication by using a special
cipher mode, which is resistant to tampering. Currently OpenSSL supports
only one such mode: XTS. Tampering with ciphertext created in XTS mode
will lead to unpredictable changes in the plaintext. That is, the user
will see "unpredictable gibberish" on the client side. Of course, this
is not an "official way" to detect tampering, but it is much better than
nothing. The "official way" (creating/checking MACs) is what we use for
metadata authentication.

Other modes like CBC, CFB, OFB, etc. supported by OpenSSL are strongly
discouraged for use in distributed systems with non-trusted servers. For
example, CBC mode doesn't "survive" an overwrite of a logical block in a
file. It means that with every such overwrite (a standard file system
operation) we'll need to re-encrypt the whole(!) file with a different
key. CFB and OFB modes are sensitive to tampering: there is a way to
perform \*predictable\* changes in plaintext, which is unacceptable.
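
The predictable-change problem can be demonstrated with a toy model. In
OFB mode the ciphertext is the plaintext XORed with a keystream, so
flipping bits in the ciphertext flips exactly the same bits in the
plaintext. The sketch below is not AES-OFB: the keystream is a stand-in
built from SHA-256 purely for illustration, and all names in it are
hypothetical. Only the XOR structure matters for the argument.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 over a counter), standing in for OFB output."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, data: bytes) -> bytes:
    return xor_bytes(data, keystream(key, len(data)))


decrypt = encrypt  # XOR with the same keystream is its own inverse

key = b"secret known only to the client"
ciphertext = encrypt(key, b"pay 100 EUR")

# The server does not know the key, but guesses that byte 4 is "1"
# and wants the client to read "9" instead.
tampered = bytearray(ciphertext)
tampered[4] ^= ord("1") ^ ord("9")

print(decrypt(key, bytes(tampered)))  # b'pay 900 EUR'
```

The attacker changed one chosen character without the key and without
disturbing the rest of the data, which is the kind of silent tampering
the crypt translator has to rule out.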

Yes, XTS is slow (at least its current implementation in OpenSSL), but
we don't promise that CFB or OFB with full-fledged authentication would
be faster.