author | Edward Shishkin <edward@redhat.com> | 2013-03-13 21:56:46 +0100 |
---|---|---|
committer | Anand Avati <avati@redhat.com> | 2013-11-13 15:12:49 -0800 |
commit | 4efbff29e773a8c59605f87bc3939c9c71b9da16 (patch) | |
tree | 3f0ac8f9c628de459a6c1fdc4f00415e4f9d743e | |
parent | 98e796e50198945adc660e42f3f5ab5b668f7bba (diff) |
Transparent data encryption and metadata authentication
.. in systems with a non-trusted server
This new functionality can be useful in various cloud technologies.
It is implemented via a special encryption/crypt translator, which
works on the client side and performs encryption and authentication.
1. Class of supported algorithms
The crypt translator can support any atomic symmetric block cipher
algorithm, i.e. one that requires plain/cipher text to be padded
before the encryption/decryption transform (see the glossary in atom.c
for definitions). In particular, it can support algorithms with the
EOF issue (which require the end of the file to be padded with extra
data).
The crypt translator performs the translations
user -> (offset, size) -> (aligned-offset, padded-size) -> server
(and backward), and resolves individual FOPs (write(), truncate(),
etc.) to read-modify-write sequences.
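The (offset, size) -> (aligned-offset, padded-size) step above can be
sketched as follows. This is only an illustration, not the code of the
translator: the names struct aligned_io and align_request() are
hypothetical, and the sketch assumes that the atom size is a power of
two (as all documented block-size values are).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the result of aligning one user request. */
struct aligned_io {
        uint64_t aligned_offset; /* request offset rounded down to an atom boundary */
        uint64_t padded_size;    /* number of bytes covered by whole atoms */
};

/*
 * Map a user request (offset, size) to the atom-aligned range that
 * covers it. Assumption: atom_size is a power of two (512 ... 4096).
 */
struct aligned_io align_request(uint64_t offset, uint64_t size,
                                uint64_t atom_size)
{
        struct aligned_io io;
        uint64_t mask = atom_size - 1;
        uint64_t end = offset + size;

        assert(atom_size != 0 && (atom_size & mask) == 0);

        /* round the start of the request down to an atom boundary */
        io.aligned_offset = offset & ~mask;
        /* round the end of the request up to the next atom boundary */
        io.padded_size = ((end + mask) & ~mask) - io.aligned_offset;
        return io;
}
```

For example, with 4096-byte atoms a 100-byte request at offset 5000
maps to the single atom starting at offset 4096, while a 200-byte
request at offset 4000 crosses an atom boundary and maps to 8192 bytes
starting at offset 0.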
A volume can contain files encrypted with different algorithms of the
mentioned class. To change an option value, just reconfigure the
volume.
Currently only one algorithm is supported: AES_XTS.
Examples of algorithms that cannot be supported by the crypt
translator:
1. Asymmetric block cipher algorithms, which inflate data, e.g. RSA;
2. Symmetric block cipher algorithms with inline MACs for data
authentication.
2. Implementation notes.
a) Atomic algorithms
Since any process in a stackable file system manipulates local data
(which can be made obsolete by the local data of another process), any
atomic cipher algorithm without proper support can lead to non-POSIX
behavior. To resolve such "collisions" we introduce locks: before
performing FOP->read(), FOP->write(), etc. the process must first
lock the file.
b) Algorithms with EOF issue
Such algorithms require the end of the file to be padded with some
extra data. Without proper support this would lose the information
about the real file size. Keeping track of the real file size is the
responsibility of the crypt translator. A special extended attribute
named "trusted.glusterfs.crypt.att.size" is used for this purpose. All
files stored in the bricks of an encrypted volume have "padded" sizes.
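The relation between the real and the padded file size can be sketched
as below. The function names are hypothetical, and the sketch assumes
the 16-byte AES cipher block (1 << AES_BLOCK_BITS in crypt-common.h);
the residue arithmetic mirrors the EOF-padding computation in
rmw_partial_block(), but the exact on-brick size is an assumption here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: AES cipher block size in bytes (1 << AES_BLOCK_BITS). */
#define CBLOCK_SIZE 16u

/* Number of padding bytes appended after the last, incomplete cipher block. */
uint32_t eof_padding_size(uint64_t real_size)
{
        uint32_t resid = (uint32_t)(real_size & (CBLOCK_SIZE - 1));

        return resid ? CBLOCK_SIZE - resid : 0;
}

/* Size the file data would occupy once the EOF padding is added. */
uint64_t padded_file_size(uint64_t real_size)
{
        uint32_t pad = eof_padding_size(real_size);

        assert(pad < CBLOCK_SIZE);
        return real_size + pad;
}
```

Under these assumptions a 100-byte file gets 12 bytes of padding and
occupies 112 bytes, which is why the real size (100) has to be kept
separately in the extended attribute.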
3. Non-trusted servers and metadata authentication
We assume that the server on which the user's data is stored is
non-trusted. This means that the server can be subjected to various
attacks aimed at revealing the user's encrypted personal data. We
provide protection against such attacks.
Every encrypted file has specific private attributes (cipher algorithm
id, atom size, etc.), which are packed into a string (so-called "format
string") and stored as a special extended attribute with the name
"trusted.glusterfs.crypt.att.cfmt". We protect the string from
tampering. This protection is mandatory, hardcoded and is always on.
Without such protection various attacks (based on extending the scope
of per-file secret keys) are possible.
Our authentication method has been developed in close collaboration
with the Red Hat security team and is implemented as "metadata loader
of version 1" (see file metadata.c). This method is NIST-compliant and
is based on checking 8-byte per-hardlink MACs, which are created
(updated) by FOP->create(), FOP->link(), FOP->unlink() and
FOP->rename() over the following unique entities:
. file (hardlink) name;
. verified file's object id (gfid).
Before manipulating a file, we always check its MACs at FOP->open()
time. Some FOPs don't require a file to be opened (e.g.
FOP->truncate()). In such cases the crypt translator opens the file
itself, unconditionally.
4. Generating keys
Unique per-file keys are derived by NIST-compliant methods from:
a) the parent key;
b) the unique verified object id of the file (gfid).
The per-volume master key, provided by the user at mount time, is at
the root of this "tree of keys".
Those keys are used to:
1) encrypt/decrypt file data;
2) encrypt/decrypt file metadata;
3) create per-file and per-link MACs for metadata authentication.
5. Instructions
Getting started with crypt translator
Example:
1) Create a volume "myvol" and enable encryption:
# gluster volume create myvol pepelac:/vols/xvol
# gluster volume set myvol encryption on
2) Set location (absolute pathname) of your master key:
# gluster volume set myvol encryption.master-key /home/me/mykey
3) Set other options to override default options, if needed.
Start the volume.
4) On the client side make sure that the file /home/me/mykey exists
and contains a proper per-volume master key (that is, a 256-bit AES
key). This key has to be in hex form, i.e. it should be represented
by 64 symbols from the set {'0', ..., '9', 'a', ..., 'f'}.
The key should start at the beginning of the file. All symbols at
offsets >= 64 are ignored.
5) Mount the volume "myvol" on the client side:
# glusterfs --volfile-server=pepelac --volfile-id=myvol /mnt
After a successful mount the file which contains the master key may
be removed. NOTE: Keeping the master key between mount sessions is
the user's responsibility.
**********************************************************************
WARNING! Losing the master key will make the content of all regular
files inaccessible. Mounting with a wrong master key still allows
access to the content of directories: file names are not encrypted.
**********************************************************************
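The master key format described in step 4 (64 lowercase hex symbols at
the beginning of the file, everything at offsets >= 64 ignored) can be
checked with a small parser like the one below. This is a sketch under
stated assumptions, not the parser used by the translator: the names
hex_value() and parse_master_key() are hypothetical, and rejecting
uppercase symbols is an assumption based on the symbol set listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MASTER_KEY_BYTES 32                     /* 256-bit key */
#define MASTER_KEY_HEX   (2 * MASTER_KEY_BYTES) /* 64 hex symbols */

/* Value of one lowercase hex symbol, or -1 if it is outside the set. */
int hex_value(char c)
{
        if (c >= '0' && c <= '9')
                return c - '0';
        if (c >= 'a' && c <= 'f')
                return c - 'a' + 10;
        return -1;
}

/*
 * Decode the first 64 symbols of buf into key[32].
 * Returns 0 on success, -1 if buf is too short or contains an
 * invalid symbol. Anything at offsets >= 64 is ignored.
 */
int parse_master_key(const char *buf, size_t len,
                     uint8_t key[MASTER_KEY_BYTES])
{
        size_t i;

        assert(buf != NULL && key != NULL);

        if (len < MASTER_KEY_HEX)
                return -1;
        for (i = 0; i < MASTER_KEY_BYTES; i++) {
                int hi = hex_value(buf[2 * i]);
                int lo = hex_value(buf[2 * i + 1]);

                if (hi < 0 || lo < 0)
                        return -1;
                key[i] = (uint8_t)((hi << 4) | lo);
        }
        return 0;
}
```

A trailing newline after the 64th symbol is harmless, since it sits at
offset 64 and is never read.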
6. Options of crypt translator
1) "master-key": specifies the location (absolute pathname) of the
file which contains the per-volume master key. There is no default
location for the master key.
2) "data-key-size": specifies the size of the per-file key for data
encryption. Possible values:
. "256" default value
. "512"
3) "block-size": specifies the atom size. Possible values:
. "512"
. "1024"
. "2048"
. "4096" default value
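The accepted option values listed above can be expressed as simple
checks. The helper names below are hypothetical and this is not the
option validation code of the translator; block_size_bits() merely
illustrates the kind of value that get_atom_bits() in atom.c is used
for (atom size == 1 << bits).

```c
#include <assert.h>
#include <stdint.h>

/* Defaults as listed in the option descriptions. */
#define DEFAULT_DATA_KEY_SIZE 256u
#define DEFAULT_BLOCK_SIZE    4096u

/* 1 if bits is an accepted "data-key-size" value, 0 otherwise. */
int data_key_size_is_valid(uint32_t bits)
{
        return bits == 256 || bits == 512;
}

/* 1 if bytes is an accepted "block-size" value, 0 otherwise. */
int block_size_is_valid(uint32_t bytes)
{
        /* a power of two in the range 512 ... 4096 */
        return bytes >= 512 && bytes <= 4096 && (bytes & (bytes - 1)) == 0;
}

/* log2 of a valid block size, e.g. 4096 -> 12. */
uint32_t block_size_bits(uint32_t bytes)
{
        uint32_t bits = 0;

        assert(block_size_is_valid(bytes));
        while ((1u << bits) < bytes)
                bits++;
        return bits;
}
```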
7. Test cases
Any workload, which involves the following file operations:
->create();
->open();
->readv();
->writev();
->truncate();
->ftruncate();
->link();
->unlink();
->rename();
->readdirp().
8. TODOs:
1) Currently the size of IOs issued by the crypt translator is
restricted by block_size (4K by default). We can use larger IOs to
improve performance.
Change-Id: I2601fe95c5c4dc5b22308a53d0cbdc071d5e5cee
BUG: 1030058
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4667
Tested-by: Gluster Build System <jenkins@build.gluster.com>
-rw-r--r-- | configure.ac | 23 |
-rw-r--r-- | xlators/encryption/Makefile.am | 2 |
-rw-r--r-- | xlators/encryption/crypt/Makefile.am | 3 |
-rw-r--r-- | xlators/encryption/crypt/src/Makefile.am | 24 |
-rw-r--r-- | xlators/encryption/crypt/src/atom.c | 962 |
-rw-r--r-- | xlators/encryption/crypt/src/crypt-common.h | 141 |
-rw-r--r-- | xlators/encryption/crypt/src/crypt-mem-types.h | 43 |
-rw-r--r-- | xlators/encryption/crypt/src/crypt.c | 4498 |
-rw-r--r-- | xlators/encryption/crypt/src/crypt.h | 899 |
-rw-r--r-- | xlators/encryption/crypt/src/data.c | 769 |
-rw-r--r-- | xlators/encryption/crypt/src/keys.c | 302 |
-rw-r--r-- | xlators/encryption/crypt/src/metadata.c | 605 |
-rw-r--r-- | xlators/encryption/crypt/src/metadata.h | 74 |
-rw-r--r-- | xlators/mgmt/glusterd/src/glusterd-volgen.c | 47 |
-rw-r--r-- | xlators/mgmt/glusterd/src/glusterd-volume-set.c | 28 |
15 files changed, 8408 insertions, 12 deletions
diff --git a/configure.ac b/configure.ac
index 0cecafb46dc..b3d1ed184c1 100644
--- a/configure.ac
+++ b/configure.ac
@@ -131,6 +131,8 @@ AC_CONFIG_FILES([Makefile
                 xlators/encryption/Makefile
                 xlators/encryption/rot-13/Makefile
                 xlators/encryption/rot-13/src/Makefile
+                xlators/encryption/crypt/Makefile
+                xlators/encryption/crypt/src/Makefile
                 xlators/features/qemu-block/Makefile
                 xlators/features/qemu-block/src/Makefile
                 xlators/system/Makefile
@@ -340,6 +342,26 @@ fi
 
 AM_CONDITIONAL([ENABLE_BD_XLATOR], [test x$BUILD_BD_XLATOR = xyes])
 
+# start encryption/crypt section
+
+AC_CHECK_HEADERS([openssl/cmac.h], [have_cmac_h=yes], [have_cmac_h=no])
+
+AC_ARG_ENABLE([crypt-xlator],
+        AC_HELP_STRING([--enable-crypt-xlator], [Build crypt encryption xlator]))
+
+if test "x$enable_crypt_xlator" = "xyes" -a "x$have_cmac_h" = "xno"; then
+   echo "Encryption xlator requires OpenSSL with cmac.h"
+   exit 1
+fi
+
+BUILD_CRYPT_XLATOR=no
+if test "x$enable_crypt_xlator" != "xno" -a "x$have_cmac_h" = "xyes"; then
+   BUILD_CRYPT_XLATOR=yes
+   AC_DEFINE(HAVE_CRYPT_XLATOR, 1, [enable building crypt encryption xlator])
+fi
+
+AM_CONDITIONAL([ENABLE_CRYPT_XLATOR], [test x$BUILD_CRYPT_XLATOR = xyes])
+
 AC_SUBST(FUSE_CLIENT_SUBDIR)
 # end FUSE section
 
@@ -865,4 +887,5 @@ echo "glupy : $BUILD_GLUPY"
 echo "Use syslog : $USE_SYSLOG"
 echo "XML output : $BUILD_XML_OUTPUT"
 echo "QEMU Block formats : $BUILD_QEMU_BLOCK"
+echo "Encryption xlator : $BUILD_CRYPT_XLATOR"
 echo
diff --git a/xlators/encryption/Makefile.am b/xlators/encryption/Makefile.am
index 2cbde680fac..36efc6698bd 100644
--- a/xlators/encryption/Makefile.am
+++ b/xlators/encryption/Makefile.am
@@ -1,3 +1,3 @@
-SUBDIRS = rot-13
+SUBDIRS = rot-13 crypt
 
 CLEANFILES =
diff --git a/xlators/encryption/crypt/Makefile.am b/xlators/encryption/crypt/Makefile.am
new file mode 100644
index 00000000000..d471a3f9243
--- /dev/null
+++ b/xlators/encryption/crypt/Makefile.am
@@ -0,0 +1,3 @@
+SUBDIRS = src
+
+CLEANFILES =
diff --git a/xlators/encryption/crypt/src/Makefile.am b/xlators/encryption/crypt/src/Makefile.am
new file mode 100644
index 00000000000..faadd117fad
--- /dev/null
+++ b/xlators/encryption/crypt/src/Makefile.am
@@ -0,0 +1,24 @@
+if ENABLE_CRYPT_XLATOR
+
+xlator_LTLIBRARIES = crypt.la
+xlatordir = $(libdir)/glusterfs/$(PACKAGE_VERSION)/xlator/encryption
+
+crypt_la_LDFLAGS = -module -avoidversion -lssl -lcrypto
+
+crypt_la_SOURCES = keys.c data.c metadata.c atom.c crypt.c
+crypt_la_LIBADD = $(top_builddir)/libglusterfs/src/libglusterfs.la
+
+noinst_HEADERS = crypt-common.h crypt-mem-types.h crypt.h metadata.h
+
+AM_CPPFLAGS = $(GF_CPPFLAGS) -I$(top_srcdir)/libglusterfs/src
+
+AM_CFLAGS = -Wall $(GF_CFLAGS)
+
+CLEANFILES =
+
+else
+
+noinst_DIST = keys.c data.c metadata.c atom.c crypt.c
+noinst_HEADERS = crypt-common.h crypt-mem-types.h crypt.h metadata.h
+
+endif
\ No newline at end of file diff --git a/xlators/encryption/crypt/src/atom.c b/xlators/encryption/crypt/src/atom.c new file mode 100644 index 00000000000..1ec41495ca1 --- /dev/null +++ b/xlators/encryption/crypt/src/atom.c @@ -0,0 +1,962 @@ +/* + Copyright (c) 2008-2013 Red Hat, Inc. <http://www.redhat.com> + This file is part of GlusterFS. + + This file is licensed to you under your choice of the GNU Lesser + General Public License, version 3 or any later version (LGPLv3 or + later), or the GNU General Public License, version 2 (GPLv2), in all + cases as published by the Free Software Foundation. +*/ + +#ifndef _CONFIG_H +#define _CONFIG_H +#include "config.h" +#endif + +#include "defaults.h" +#include "crypt-common.h" +#include "crypt.h" + +/* + * Glossary + * + * + * cblock (or cipher block). A logical unit in a file. + * cblock size is defined as the number of bits + * in an input (or output) block of the block + * cipher (*). Cipher block size is a property of + * cipher algorithm. E.g. cblock size is 64 bits + * for DES, 128 bits for AES, etc. + * + * atomic cipher A cipher algorithm, which requires some chunks of + * algorithm text to be padded at left and(or) right sides before + * cipher transaform. + * + * + * block (atom) Minimal chunk of file's data, which doesn't require + * padding. We'll consider logical units in a file of + * block size (atom size). + * + * cipher algorithm Atomic cipher algorithm, which requires the last + * with EOF issue incomplete cblock in a file to be padded with some + * data (usually zeros). + * + * + * operation, which reading/writing from offset, which is not aligned to + * forms a gap at to atom size + * the beginning + * + * + * operation, which reading/writing count bytes starting from offset off, + * forms a gap at so that off+count is not aligned to atom_size + * the end + * + * head block the first atom affected by an operation, which forms + * a gap at the beginning, or(and) at the end. + * Сomment. 
Head block has at least one gap (either at + * the beginning, or at the end) + * + * + * tail block the last atom different from head, affected by an + * operation, which forms a gap at the end. + * Сomment: Tail block has exactly one gap (at the end). + * + * + * partial block head or tail block + * + * + * full block block without gaps. + * + * + * (*) Recommendation for Block Cipher Modes of Operation + * Methods and Techniques + * NIST Special Publication 800-38A Edition 2001 + */ + +/* + * atom->offset_at() + */ +static off_t offset_at_head(struct avec_config *conf) +{ + return conf->aligned_offset; +} + +static off_t offset_at_hole_head(call_frame_t *frame, + struct object_cipher_info *object) +{ + return offset_at_head(get_hole_conf(frame)); +} + +static off_t offset_at_data_head(call_frame_t *frame, + struct object_cipher_info *object) +{ + return offset_at_head(get_data_conf(frame)); +} + + +static off_t offset_at_tail(struct avec_config *conf, + struct object_cipher_info *object) +{ + return conf->aligned_offset + + (conf->off_in_head ? get_atom_size(object) : 0) + + (conf->nr_full_blocks << get_atom_bits(object)); +} + +static off_t offset_at_hole_tail(call_frame_t *frame, + struct object_cipher_info *object) +{ + return offset_at_tail(get_hole_conf(frame), object); +} + + +static off_t offset_at_data_tail(call_frame_t *frame, + struct object_cipher_info *object) +{ + return offset_at_tail(get_data_conf(frame), object); +} + +static off_t offset_at_full(struct avec_config *conf, + struct object_cipher_info *object) +{ + return conf->aligned_offset + + (conf->off_in_head ? 
get_atom_size(object) : 0); +} + +static off_t offset_at_data_full(call_frame_t *frame, + struct object_cipher_info *object) +{ + return offset_at_full(get_data_conf(frame), object); +} + +static off_t offset_at_hole_full(call_frame_t *frame, + struct object_cipher_info *object) +{ + return offset_at_full(get_hole_conf(frame), object); +} + +/* + * atom->io_size_nopad() + */ + +static uint32_t io_size_nopad_head(struct avec_config *conf, + struct object_cipher_info *object) +{ + uint32_t gap_at_beg; + uint32_t gap_at_end; + + check_head_block(conf); + + gap_at_beg = conf->off_in_head; + + if (has_tail_block(conf) || has_full_blocks(conf) || conf->off_in_tail == 0 ) + gap_at_end = 0; + else + gap_at_end = get_atom_size(object) - conf->off_in_tail; + + return get_atom_size(object) - (gap_at_beg + gap_at_end); +} + +static uint32_t io_size_nopad_tail(struct avec_config *conf, + struct object_cipher_info *object) +{ + check_tail_block(conf); + return conf->off_in_tail; +} + +static uint32_t io_size_nopad_full(struct avec_config *conf, + struct object_cipher_info *object) +{ + check_full_block(conf); + return get_atom_size(object); +} + +static uint32_t io_size_nopad_data_head(call_frame_t *frame, + struct object_cipher_info *object) +{ + return io_size_nopad_head(get_data_conf(frame), object); +} + +static uint32_t io_size_nopad_hole_head(call_frame_t *frame, + struct object_cipher_info *object) +{ + return io_size_nopad_head(get_hole_conf(frame), object); +} + +static uint32_t io_size_nopad_data_tail(call_frame_t *frame, + struct object_cipher_info *object) +{ + return io_size_nopad_tail(get_data_conf(frame), object); +} + +static uint32_t io_size_nopad_hole_tail(call_frame_t *frame, + struct object_cipher_info *object) +{ + return io_size_nopad_tail(get_hole_conf(frame), object); +} + +static uint32_t io_size_nopad_data_full(call_frame_t *frame, + struct object_cipher_info *object) +{ + return io_size_nopad_full(get_data_conf(frame), object); +} + +static uint32_t 
io_size_nopad_hole_full(call_frame_t *frame, + struct object_cipher_info *object) +{ + return io_size_nopad_full(get_hole_conf(frame), object); +} + +static uint32_t offset_in_head(struct avec_config *conf) +{ + check_cursor_head(conf); + + return conf->off_in_head; +} + +static uint32_t offset_in_tail(call_frame_t *frame, + struct object_cipher_info *object) +{ + return 0; +} + +static uint32_t offset_in_full(struct avec_config *conf, + struct object_cipher_info *object) +{ + check_cursor_full(conf); + + if (has_head_block(conf)) + return (conf->cursor - 1) << get_atom_bits(object); + else + return conf->cursor << get_atom_bits(object); +} + +static uint32_t offset_in_data_head(call_frame_t *frame, + struct object_cipher_info *object) +{ + return offset_in_head(get_data_conf(frame)); +} + +static uint32_t offset_in_hole_head(call_frame_t *frame, + struct object_cipher_info *object) +{ + return offset_in_head(get_hole_conf(frame)); +} + +static uint32_t offset_in_data_full(call_frame_t *frame, + struct object_cipher_info *object) +{ + return offset_in_full(get_data_conf(frame), object); +} + +static uint32_t offset_in_hole_full(call_frame_t *frame, + struct object_cipher_info *object) +{ + return offset_in_full(get_hole_conf(frame), object); +} + +/* + * atom->rmw() + */ +/* + * Pre-conditions: + * @vec contains plain text of the latest + * version. + * + * Uptodate gaps of the @partial block with + * this plain text, encrypt the whole block + * and write the result to disk. 
+ */ +static int32_t rmw_partial_block(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iovec *vec, + int32_t count, + struct iatt *stbuf, + struct iobref *iobref, + struct rmw_atom *atom) +{ + size_t was_read = 0; + uint64_t file_size; + crypt_local_t *local = frame->local; + struct object_cipher_info *object = &local->info->cinfo; + + struct iovec *partial = atom->get_iovec(frame, 0); + struct avec_config *conf = atom->get_config(frame); + end_writeback_handler_t end_writeback_partial_block; +#if DEBUG_CRYPT + gf_boolean_t check_last_cblock = _gf_false; +#endif + local->op_ret = op_ret; + local->op_errno = op_errno; + + if (op_ret < 0) + goto exit; + + file_size = local->cur_file_size; + was_read = op_ret; + + if (atom->locality == HEAD_ATOM && conf->off_in_head) { + /* + * head atom with a non-uptodate gap + * at the beginning + * + * fill the gap with plain text of the + * latest version. Convert a part of hole + * (if any) to zeros. + */ + int32_t i; + int32_t copied = 0; + int32_t to_gap; /* amount of data needed to uptodate + the gap at the beginning */ +#if 0 + int32_t hole = 0; /* The part of the hole which + * got in the head block */ +#endif /* 0 */ + to_gap = conf->off_in_head; + + if (was_read < to_gap) { + if (file_size > + offset_at_head(conf) + was_read) { + /* + * It is impossible to uptodate + * head block: too few bytes have + * been read from disk, so that + * partial write is impossible. + * + * It could happen because of many + * reasons: IO errors, (meta)data + * corruption in the local file system, + * etc. 
+ */ + gf_log(this->name, GF_LOG_WARNING, + "Can not uptodate a gap at the beginning"); + local->op_ret = -1; + local->op_errno = EIO; + goto exit; + } +#if 0 + hole = to_gap - was_read; +#endif /* 0 */ + to_gap = was_read; + } + /* + * uptodate the gap at the beginning + */ + for (i = 0; i < count && copied < to_gap; i++) { + int32_t to_copy; + + to_copy = vec[i].iov_len; + if (to_copy > to_gap - copied) + to_copy = to_gap - copied; + + memcpy(partial->iov_base, vec[i].iov_base, to_copy); + copied += to_copy; + } +#if 0 + /* + * If possible, convert part of the + * hole, which got in the head block + */ + ret = TRY_LOCK(&local->hole_lock); + if (!ret) { + if (local->hole_handled) + /* + * already converted by + * crypt_writev_cbk() + */ + UNLOCK(&local->hole_lock); + else { + /* + * convert the part of the hole + * which got in the head block + * to zeros. + * + * Update the orig_offset to make + * sure writev_cbk() won't care + * about this part of the hole. + * + */ + memset(partial->iov_base + to_gap, 0, hole); + + conf->orig_offset -= hole; + conf->orig_size += hole; + UNLOCK(&local->hole_lock); + } + } + else /* + * conversion is being performed + * by crypt_writev_cbk() + */ + ; +#endif /* 0 */ + } + if (atom->locality == TAIL_ATOM || + (!has_tail_block(conf) && conf->off_in_tail)) { + /* + * tail atom, or head atom with a non-uptodate + * gap at the end. + * + * fill the gap at the end of the block + * with plain text of the latest version. + * Pad the result, (if needed) + */ + int32_t i; + int32_t to_gap; + int copied; + off_t off_in_tail; + int32_t to_copy; + + off_in_tail = conf->off_in_tail; + to_gap = conf->gap_in_tail; + + if (to_gap && was_read < off_in_tail + to_gap) { + /* + * It is impossible to uptodate + * the gap at the end: too few bytes + * have been read from disk, so that + * partial write is impossible. + * + * It could happen because of many + * reasons: IO errors, (meta)data + * corruption in the local file system, + * etc. 
+ */ + gf_log(this->name, GF_LOG_WARNING, + "Can not uptodate a gap at the end"); + local->op_ret = -1; + local->op_errno = EIO; + goto exit; + } + /* + * uptodate the gap at the end + */ + copied = 0; + to_copy = to_gap; + for(i = count - 1; i >= 0 && to_copy > 0; i--) { + uint32_t from_vec, off_in_vec; + + off_in_vec = 0; + from_vec = vec[i].iov_len; + if (from_vec > to_copy) { + off_in_vec = from_vec - to_copy; + from_vec = to_copy; + } + memcpy(partial->iov_base + + off_in_tail + to_gap - copied - from_vec, + vec[i].iov_base + off_in_vec, + from_vec); + + gf_log(this->name, GF_LOG_DEBUG, + "uptodate %d bytes at tail. Offset at target(source): %d(%d)", + (int)from_vec, + (int)off_in_tail + to_gap - copied - from_vec, + (int)off_in_vec); + + copied += from_vec; + to_copy -= from_vec; + } + partial->iov_len = off_in_tail + to_gap; + + if (object_alg_should_pad(object)) { + int32_t resid = 0; + resid = partial->iov_len & (object_alg_blksize(object) - 1); + if (resid) { + /* + * append a new EOF padding + */ + local->eof_padding_size = + object_alg_blksize(object) - resid; + + gf_log(this->name, GF_LOG_DEBUG, + "set padding size %d", + local->eof_padding_size); + + memset(partial->iov_base + partial->iov_len, + 1, + local->eof_padding_size); + partial->iov_len += local->eof_padding_size; +#if DEBUG_CRYPT + gf_log(this->name, GF_LOG_DEBUG, + "pad cblock with %d zeros:", + local->eof_padding_size); + dump_cblock(this, + (unsigned char *)partial->iov_base + + partial->iov_len - object_alg_blksize(object)); + check_last_cblock = _gf_true; +#endif + } + } + } + /* + * encrypt the whole block + */ + encrypt_aligned_iov(object, + partial, + 1, + atom->offset_at(frame, object)); +#if DEBUG_CRYPT + if (check_last_cblock == _gf_true) { + gf_log(this->name, GF_LOG_DEBUG, + "encrypt last cblock with offset %llu", + (unsigned long long)atom->offset_at(frame, object)); + dump_cblock(this, (unsigned char *)partial->iov_base + + partial->iov_len - object_alg_blksize(object)); + } 
+#endif + set_local_io_params_writev(frame, object, atom, + atom->offset_at(frame, object), + iovec_get_size(partial, 1)); + /* + * write the whole block to disk + */ + end_writeback_partial_block = dispatch_end_writeback(local->fop); + conf->cursor ++; + STACK_WIND(frame, + end_writeback_partial_block, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->writev, + local->fd, + partial, + 1, + atom->offset_at(frame, object), + local->flags, + local->iobref_data, + local->xdata); + + gf_log("crypt", GF_LOG_DEBUG, + "submit partial block: %d bytes from %d offset", + (int)iovec_get_size(partial, 1), + (int)atom->offset_at(frame, object)); + exit: + return 0; +} + +/* + * Perform a (read-)modify-write sequence. + * This should be performed only after approval + * of upper server-side manager, i.e. the caller + * needs to make sure this is his turn to rmw. + */ +void submit_partial(call_frame_t *frame, + xlator_t *this, + fd_t *fd, + atom_locality_type ltype) +{ + int32_t ret; + dict_t *dict; + struct rmw_atom *atom; + crypt_local_t *local = frame->local; + struct object_cipher_info *object = &local->info->cinfo; + + atom = atom_by_types(local->active_setup, ltype); + /* + * To perform the "read" component of the read-modify-write + * sequence the crypt translator does stack_wind to itself. 
+ * + * Pass current file size to crypt_readv() + */ + dict = dict_new(); + if (!dict) { + /* + * FIXME: Handle the error + */ + gf_log("crypt", GF_LOG_WARNING, "Can not alloc dict"); + return; + } + ret = dict_set(dict, + FSIZE_XATTR_PREFIX, + data_from_uint64(local->cur_file_size)); + if (ret) { + /* + * FIXME: Handle the error + */ + dict_unref(dict); + gf_log("crypt", GF_LOG_WARNING, "Can not set dict"); + goto exit; + } + STACK_WIND(frame, + atom->rmw, + this, + this->fops->readv, /* crypt_readv */ + fd, + atom->count_to_uptodate(frame, object), /* count */ + atom->offset_at(frame, object), /* offset to read from */ + 0, + dict); + exit: + dict_unref(dict); +} + +/* + * submit blocks of FULL_ATOM type + */ +void submit_full(call_frame_t *frame, xlator_t *this) +{ + crypt_local_t *local = frame->local; + struct object_cipher_info *object = &local->info->cinfo; + struct rmw_atom *atom = atom_by_types(local->active_setup, FULL_ATOM); + uint32_t count; /* total number of full blocks to submit */ + uint32_t granularity; /* number of blocks to submit in one iteration */ + + uint64_t off_in_file; /* start offset in the file, bytes */ + uint32_t off_in_atom; /* start offset in the atom, blocks */ + uint32_t blocks_written = 0; /* blocks written for this submit */ + + struct avec_config *conf = atom->get_config(frame); + end_writeback_handler_t end_writeback_full_block; + /* + * Write full blocks by groups of granularity size. + */ + end_writeback_full_block = dispatch_end_writeback(local->fop); + + if (is_ordered_mode(frame)) { + uint32_t skip = has_head_block(conf) ? 1 : 0; + count = 1; + granularity = 1; + /* + * calculate start offset using cursor value; + * here we should take into accout head block, + * which corresponds to cursor value 0. 
+ */ + off_in_file = atom->offset_at(frame, object) + + ((conf->cursor - skip) << get_atom_bits(object)); + off_in_atom = conf->cursor - skip; + } + else { + /* + * in parallel mode + */ + count = conf->nr_full_blocks; + granularity = MAX_IOVEC; + off_in_file = atom->offset_at(frame, object); + off_in_atom = 0; + } + while (count) { + uint32_t blocks_to_write = count; + + if (blocks_to_write > granularity) + blocks_to_write = granularity; + if (conf->type == HOLE_ATOM) + /* + * reset iovec before encryption + */ + memset(atom->get_iovec(frame, 0)->iov_base, + 0, + get_atom_size(object)); + /* + * encrypt the group + */ + encrypt_aligned_iov(object, + atom->get_iovec(frame, + off_in_atom + + blocks_written), + blocks_to_write, + off_in_file + (blocks_written << + get_atom_bits(object))); + + set_local_io_params_writev(frame, object, atom, + off_in_file + (blocks_written << get_atom_bits(object)), + blocks_to_write << get_atom_bits(object)); + + conf->cursor += blocks_to_write; + + STACK_WIND(frame, + end_writeback_full_block, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->writev, + local->fd, + atom->get_iovec(frame, off_in_atom + blocks_written), + blocks_to_write, + off_in_file + (blocks_written << get_atom_bits(object)), + local->flags, + local->iobref_data ? 
local->iobref_data : local->iobref, + local->xdata); + + gf_log("crypt", GF_LOG_DEBUG, "submit %d full blocks from %d offset", + blocks_to_write, + (int)(off_in_file + (blocks_written << get_atom_bits(object)))); + + count -= blocks_to_write; + blocks_written += blocks_to_write; + } + return; +} + +static int32_t rmw_data_head(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iovec *vec, + int32_t count, + struct iatt *stbuf, + struct iobref *iobref, + dict_t *xdata) +{ + return rmw_partial_block(frame, + cookie, + this, + op_ret, + op_errno, + vec, + count, + stbuf, + iobref, + atom_by_types(DATA_ATOM, HEAD_ATOM)); +} + +static int32_t rmw_data_tail(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iovec *vec, + int32_t count, + struct iatt *stbuf, + struct iobref *iobref, + dict_t *xdata) +{ + return rmw_partial_block(frame, + cookie, + this, + op_ret, + op_errno, + vec, + count, + stbuf, + iobref, + atom_by_types(DATA_ATOM, TAIL_ATOM)); +} + +static int32_t rmw_hole_head(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iovec *vec, + int32_t count, + struct iatt *stbuf, + struct iobref *iobref, + dict_t *xdata) +{ + return rmw_partial_block(frame, + cookie, + this, + op_ret, + op_errno, + vec, + count, + stbuf, + iobref, + atom_by_types(HOLE_ATOM, HEAD_ATOM)); +} + +static int32_t rmw_hole_tail(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iovec *vec, + int32_t count, + struct iatt *stbuf, + struct iobref *iobref, + dict_t *xdata) +{ + return rmw_partial_block(frame, + cookie, + this, + op_ret, + op_errno, + vec, + count, + stbuf, + iobref, + atom_by_types(HOLE_ATOM, TAIL_ATOM)); +} + +/* + * atom->count_to_uptodate() + */ +static uint32_t count_to_uptodate_head(struct avec_config *conf, + struct object_cipher_info *object) +{ + if (conf->acount == 1 && 
conf->off_in_tail) + return get_atom_size(object); + else + /* there is no need to read the whole head block */ + return conf->off_in_head; +} + +static uint32_t count_to_uptodate_tail(struct avec_config *conf, + struct object_cipher_info *object) +{ + /* we need to read the whole tail block */ + return get_atom_size(object); +} + +static uint32_t count_to_uptodate_data_head(call_frame_t *frame, + struct object_cipher_info *object) +{ + return count_to_uptodate_head(get_data_conf(frame), object); +} + +static uint32_t count_to_uptodate_data_tail(call_frame_t *frame, + struct object_cipher_info *object) +{ + return count_to_uptodate_tail(get_data_conf(frame), object); +} + +static uint32_t count_to_uptodate_hole_head(call_frame_t *frame, + struct object_cipher_info *object) +{ + return count_to_uptodate_head(get_hole_conf(frame), object); +} + +static uint32_t count_to_uptodate_hole_tail(call_frame_t *frame, + struct object_cipher_info *object) +{ + return count_to_uptodate_tail(get_hole_conf(frame), object); +} + +/* atom->get_config() */ + +static struct avec_config *get_config_data(call_frame_t *frame) +{ + return &((crypt_local_t *)frame->local)->data_conf; +} + +static struct avec_config *get_config_hole(call_frame_t *frame) +{ + return &((crypt_local_t *)frame->local)->hole_conf; +} + +/* + * atom->get_iovec() + */ +static struct iovec *get_iovec_hole_head(call_frame_t *frame, + uint32_t count) +{ + struct avec_config *conf = get_hole_conf(frame); + + return conf->avec; +} + +static struct iovec *get_iovec_hole_full(call_frame_t *frame, + uint32_t count) +{ + struct avec_config *conf = get_hole_conf(frame); + + return conf->avec + (conf->off_in_head ? 
1 : 0); +} + +static inline struct iovec *get_iovec_hole_tail(call_frame_t *frame, + uint32_t count) +{ + struct avec_config *conf = get_hole_conf(frame); + + return conf->avec + (conf->blocks_in_pool - 1); +} + +static inline struct iovec *get_iovec_data_head(call_frame_t *frame, + uint32_t count) +{ + struct avec_config *conf = get_data_conf(frame); + + return conf->avec; +} + +static inline struct iovec *get_iovec_data_full(call_frame_t *frame, + uint32_t count) +{ + struct avec_config *conf = get_data_conf(frame); + + return conf->avec + (conf->off_in_head ? 1 : 0) + count; +} + +static inline struct iovec *get_iovec_data_tail(call_frame_t *frame, + uint32_t count) +{ + struct avec_config *conf = get_data_conf(frame); + + return conf->avec + + (conf->off_in_head ? 1 : 0) + + conf->nr_full_blocks; +} + +static struct rmw_atom atoms[LAST_DATA_TYPE][LAST_LOCALITY_TYPE] = { + [DATA_ATOM][HEAD_ATOM] = + { .locality = HEAD_ATOM, + .rmw = rmw_data_head, + .offset_at = offset_at_data_head, + .offset_in = offset_in_data_head, + .get_iovec = get_iovec_data_head, + .io_size_nopad = io_size_nopad_data_head, + .count_to_uptodate = count_to_uptodate_data_head, + .get_config = get_config_data + }, + [DATA_ATOM][TAIL_ATOM] = + { .locality = TAIL_ATOM, + .rmw = rmw_data_tail, + .offset_at = offset_at_data_tail, + .offset_in = offset_in_tail, + .get_iovec = get_iovec_data_tail, + .io_size_nopad = io_size_nopad_data_tail, + .count_to_uptodate = count_to_uptodate_data_tail, + .get_config = get_config_data + }, + [DATA_ATOM][FULL_ATOM] = + { .locality = FULL_ATOM, + .offset_at = offset_at_data_full, + .offset_in = offset_in_data_full, + .get_iovec = get_iovec_data_full, + .io_size_nopad = io_size_nopad_data_full, + .get_config = get_config_data + }, + [HOLE_ATOM][HEAD_ATOM] = + { .locality = HEAD_ATOM, + .rmw = rmw_hole_head, + .offset_at = offset_at_hole_head, + .offset_in = offset_in_hole_head, + .get_iovec = get_iovec_hole_head, + .io_size_nopad = io_size_nopad_hole_head, + 
.count_to_uptodate = count_to_uptodate_hole_head, + .get_config = get_config_hole + }, + [HOLE_ATOM][TAIL_ATOM] = + { .locality = TAIL_ATOM, + .rmw = rmw_hole_tail, + .offset_at = offset_at_hole_tail, + .offset_in = offset_in_tail, + .get_iovec = get_iovec_hole_tail, + .io_size_nopad = io_size_nopad_hole_tail, + .count_to_uptodate = count_to_uptodate_hole_tail, + .get_config = get_config_hole + }, + [HOLE_ATOM][FULL_ATOM] = + { .locality = FULL_ATOM, + .offset_at = offset_at_hole_full, + .offset_in = offset_in_hole_full, + .get_iovec = get_iovec_hole_full, + .io_size_nopad = io_size_nopad_hole_full, + .get_config = get_config_hole + } +}; + +struct rmw_atom *atom_by_types(atom_data_type data, + atom_locality_type locality) +{ + return &atoms[data][locality]; +} + +/* + Local variables: + c-indentation-style: "K&R" + mode-name: "LC" + c-basic-offset: 8 + tab-width: 8 + fill-column: 80 + scroll-step: 1 + End: +*/ diff --git a/xlators/encryption/crypt/src/crypt-common.h b/xlators/encryption/crypt/src/crypt-common.h new file mode 100644 index 00000000000..7c212ad5d25 --- /dev/null +++ b/xlators/encryption/crypt/src/crypt-common.h @@ -0,0 +1,141 @@ +/* + Copyright (c) 2008-2013 Red Hat, Inc. <http://www.redhat.com> + This file is part of GlusterFS. + + This file is licensed to you under your choice of the GNU Lesser + General Public License, version 3 or any later version (LGPLv3 or + later), or the GNU General Public License, version 2 (GPLv2), in all + cases as published by the Free Software Foundation. 
+*/ + +#ifndef __CRYPT_COMMON_H__ +#define __CRYPT_COMMON_H__ + +#define INVAL_SUBVERSION_NUMBER (0xff) +#define CRYPT_INVAL_OP (GF_FOP_NULL) + +#define CRYPTO_FORMAT_PREFIX "trusted.glusterfs.crypt.att.cfmt" +#define FSIZE_XATTR_PREFIX "trusted.glusterfs.crypt.att.size" +#define SUBREQ_PREFIX "trusted.glusterfs.crypt.msg.sreq" +#define FSIZE_MSG_PREFIX "trusted.glusterfs.crypt.msg.size" +#define DE_MSG_PREFIX "trusted.glusterfs.crypt.msg.dent" +#define REQUEST_ID_PREFIX "trusted.glusterfs.crypt.msg.rqid" +#define MSGFLAGS_PREFIX "trusted.glusterfs.crypt.msg.xfgs" + + +/* messages for crypt_open() */ +#define MSGFLAGS_REQUEST_MTD_RLOCK 1 /* take read lock and don't unlock */ +#define MSGFLAGS_REQUEST_MTD_WLOCK 2 /* take write lock and don't unlock */ + +#define AES_BLOCK_BITS (4) /* AES_BLOCK_SIZE == 1 << AES_BLOCK_BITS */ + +#define noop do {; } while (0) +#define cassert(cond) ({ switch (-1) { case (cond): case 0: break; } }) +#define __round_mask(x, y) ((__typeof__(x))((y)-1)) +#define round_up(x, y) ((((x)-1) | __round_mask(x, y))+1) + +/* + * Format of file's metadata + */ +struct crypt_format { + uint8_t loader_id; /* version of metadata loader */ + uint8_t versioned[0]; /* file's metadata of specific version */ +} __attribute__((packed)); + +typedef enum { + AES_CIPHER_ALG, + LAST_CIPHER_ALG +} cipher_alg_t; + +typedef enum { + XTS_CIPHER_MODE, + LAST_CIPHER_MODE +} cipher_mode_t; + +typedef enum { + MTD_LOADER_V1, + LAST_MTD_LOADER +} mtd_loader_id; + +static inline void msgflags_set_mtd_rlock(uint32_t *flags) +{ + *flags |= MSGFLAGS_REQUEST_MTD_RLOCK; +} + +static inline void msgflags_set_mtd_wlock(uint32_t *flags) +{ + *flags |= MSGFLAGS_REQUEST_MTD_WLOCK; +} + +static inline gf_boolean_t msgflags_check_mtd_rlock(uint32_t *flags) +{ + return *flags & MSGFLAGS_REQUEST_MTD_RLOCK; +} + +static inline gf_boolean_t msgflags_check_mtd_wlock(uint32_t *flags) +{ + return *flags & MSGFLAGS_REQUEST_MTD_WLOCK; +} + +static inline gf_boolean_t 
msgflags_check_mtd_lock(uint32_t *flags) +{ + return msgflags_check_mtd_rlock(flags) || + msgflags_check_mtd_wlock(flags); +} + +/* + * returns number of logical blocks occupied + * (maybe partially) by @count bytes + * at offset @start. + */ +static inline off_t logical_blocks_occupied(uint64_t start, off_t count, + int blkbits) +{ + return ((start + count - 1) >> blkbits) - (start >> blkbits) + 1; +} + +/* + * are two bytes (represented by offsets @off1 + * and @off2 respectively) in the same logical + * block. + */ +static inline int in_same_lblock(uint64_t off1, uint64_t off2, + int blkbits) +{ + return off1 >> blkbits == off2 >> blkbits; +} + +static inline void dump_cblock(xlator_t *this, unsigned char *buf) +{ + gf_log(this->name, GF_LOG_DEBUG, + "dump cblock: %x %x %x %x %x %x %x %x %x %x %x %x %x %x %x %x", + (buf)[0], + (buf)[1], + (buf)[2], + (buf)[3], + (buf)[4], + (buf)[5], + (buf)[6], + (buf)[7], + (buf)[8], + (buf)[9], + (buf)[10], + (buf)[11], + (buf)[12], + (buf)[13], + (buf)[14], + (buf)[15]); +} + +#endif /* __CRYPT_COMMON_H__ */ + +/* + Local variables: + c-indentation-style: "K&R" + mode-name: "LC" + c-basic-offset: 8 + tab-width: 8 + fill-column: 80 + scroll-step: 1 + End: +*/ diff --git a/xlators/encryption/crypt/src/crypt-mem-types.h b/xlators/encryption/crypt/src/crypt-mem-types.h new file mode 100644 index 00000000000..799727573c3 --- /dev/null +++ b/xlators/encryption/crypt/src/crypt-mem-types.h @@ -0,0 +1,43 @@ +/* + Copyright (c) 2008-2013 Red Hat, Inc. <http://www.redhat.com> + This file is part of GlusterFS. + + This file is licensed to you under your choice of the GNU Lesser + General Public License, version 3 or any later version (LGPLv3 or + later), or the GNU General Public License, version 2 (GPLv2), in all + cases as published by the Free Software Foundation. 
+*/ + + +#ifndef __CRYPT_MEM_TYPES_H__ +#define __CRYPT_MEM_TYPES_H__ + +#include "mem-types.h" + +enum gf_crypt_mem_types_ { + gf_crypt_mt_priv = gf_common_mt_end + 1, + gf_crypt_mt_inode, + gf_crypt_mt_data, + gf_crypt_mt_mtd, + gf_crypt_mt_loc, + gf_crypt_mt_iatt, + gf_crypt_mt_key, + gf_crypt_mt_iovec, + gf_crypt_mt_char, +}; + +#endif /* __CRYPT_MEM_TYPES_H__ */ + +/* + Local variables: + c-indentation-style: "K&R" + mode-name: "LC" + c-basic-offset: 8 + tab-width: 8 + fill-column: 80 + scroll-step: 1 + End: +*/ + + + diff --git a/xlators/encryption/crypt/src/crypt.c b/xlators/encryption/crypt/src/crypt.c new file mode 100644 index 00000000000..db2e6d83cf5 --- /dev/null +++ b/xlators/encryption/crypt/src/crypt.c @@ -0,0 +1,4498 @@ +/* + Copyright (c) 2008-2012 Red Hat, Inc. <http://www.redhat.com> + This file is part of GlusterFS. + + This file is licensed to you under your choice of the GNU Lesser + General Public License, version 3 or any later version (LGPLv3 or + later), or the GNU General Public License, version 2 (GPLv2), in all + cases as published by the Free Software Foundation. 
+*/ +#include <ctype.h> +#include <sys/uio.h> + +#ifndef _CONFIG_H +#define _CONFIG_H +#include "config.h" +#endif + +#include "glusterfs.h" +#include "xlator.h" +#include "logging.h" +#include "defaults.h" + +#include "crypt-common.h" +#include "crypt.h" + +static void init_inode_info_head(struct crypt_inode_info *info, fd_t *fd); +static int32_t init_inode_info_tail(struct crypt_inode_info *info, + struct master_cipher_info *master); +static int32_t prepare_for_submit_hole(call_frame_t *frame, xlator_t *this, + uint64_t from, off_t size); +static int32_t load_file_size(call_frame_t *frame, void *cookie, + xlator_t *this, int32_t op_ret, int32_t op_errno, + dict_t *dict, dict_t *xdata); +static void do_ordered_submit(call_frame_t *frame, xlator_t *this, + atom_data_type dtype); +static void do_parallel_submit(call_frame_t *frame, xlator_t *this, + atom_data_type dtype); +static void put_one_call_open(call_frame_t *frame); +static void put_one_call_readv(call_frame_t *frame, xlator_t *this); +static void put_one_call_writev(call_frame_t *frame, xlator_t *this); +static void put_one_call_ftruncate(call_frame_t *frame, xlator_t *this); +static void free_avec(struct iovec *avec, char **pool, int blocks_in_pool); +static void free_avec_data(crypt_local_t *local); +static void free_avec_hole(crypt_local_t *local); + +static crypt_local_t *crypt_alloc_local(call_frame_t *frame, xlator_t *this, + glusterfs_fop_t fop) +{ + crypt_local_t *local = NULL; + + local = mem_get0(this->local_pool); + if (!local) { + gf_log(this->name, GF_LOG_ERROR, "out of memory"); + return NULL; + } + local->fop = fop; + LOCK_INIT(&local->hole_lock); + LOCK_INIT(&local->call_lock); + LOCK_INIT(&local->rw_count_lock); + + frame->local = local; + return local; +} + +struct crypt_inode_info *get_crypt_inode_info(inode_t *inode, xlator_t *this) +{ + int ret; + uint64_t value = 0; + struct crypt_inode_info *info; + + ret = inode_ctx_get(inode, this, &value); + if (ret == -1) { + gf_log (this->name, 
GF_LOG_WARNING, + "Can not get inode info"); + return NULL; + } + info = (struct crypt_inode_info *)(long)value; + if (info == NULL) { + gf_log (this->name, GF_LOG_WARNING, + "Can not obtain inode info"); + return NULL; + } + return info; +} + +static struct crypt_inode_info *local_get_inode_info(crypt_local_t *local, + xlator_t *this) +{ + if (local->info) + return local->info; + local->info = get_crypt_inode_info(local->fd->inode, this); + return local->info; +} + +static struct crypt_inode_info *alloc_inode_info(crypt_local_t *local, + loc_t *loc) +{ + struct crypt_inode_info *info; + + info = GF_CALLOC(1, sizeof(*info), gf_crypt_mt_inode); + if (!info) { + local->op_ret = -1; + local->op_errno = ENOMEM; + gf_log ("crypt", GF_LOG_WARNING, + "Can not allocate inode info"); + return NULL; + } + memset(info, 0, sizeof(*info)); +#if DEBUG_CRYPT + info->loc = GF_CALLOC(1, sizeof(*loc), gf_crypt_mt_loc); + if (!info->loc) { + gf_log("crypt", GF_LOG_WARNING, "Can not allocate loc"); + GF_FREE(info); + return NULL; + } + if (loc_copy(info->loc, loc)){ + GF_FREE(info->loc); + GF_FREE(info); + return NULL; + } +#endif /* DEBUG_CRYPT */ + + local->info = info; + return info; +} + +static void free_inode_info(struct crypt_inode_info *info) +{ +#if DEBUG_CRYPT + loc_wipe(info->loc); + GF_FREE(info->loc); +#endif + memset(info, 0, sizeof(*info)); + GF_FREE(info); +} + +int crypt_forget (xlator_t *this, inode_t *inode) +{ + uint64_t ctx_addr = 0; + if (!inode_ctx_del (inode, this, &ctx_addr)) + free_inode_info((struct crypt_inode_info *)(long)ctx_addr); + return 0; +} + +#if DEBUG_CRYPT +static void check_read(call_frame_t *frame, xlator_t *this, int32_t read, + struct iovec *vec, int32_t count, struct iatt *stbuf) +{ + crypt_local_t *local = frame->local; + struct object_cipher_info *object = get_object_cinfo(local->info); + struct avec_config *conf = &local->data_conf; + uint32_t resid = stbuf->ia_size & (object_alg_blksize(object) - 1); + + if (read <= 0) + return; + if 
(read != iovec_get_size(vec, count)) + gf_log ("crypt", GF_LOG_DEBUG, + "op_ret differs from amount of read bytes"); + + if (object_alg_should_pad(object) && (read & (object_alg_blksize(object) - 1))) + gf_log ("crypt", GF_LOG_DEBUG, + "bad amount of read bytes (!= 0 mod(cblock size))"); + + if (conf->aligned_offset + read > + stbuf->ia_size + (resid ? object_alg_blksize(object) - resid : 0)) + gf_log ("crypt", GF_LOG_DEBUG, + "bad amount of read bytes (too large)"); + +} + +#define PT_BYTES_TO_DUMP (32) +static void dump_plain_text(crypt_local_t *local, struct iovec *avec) +{ + int32_t to_dump; + char str[PT_BYTES_TO_DUMP + 1]; + + if (!avec) + return; + to_dump = avec->iov_len; + if (to_dump > PT_BYTES_TO_DUMP) + to_dump = PT_BYTES_TO_DUMP; + memcpy(str, avec->iov_base, to_dump); + /* NUL-terminate the copy before logging it with %s */ + memset(str + to_dump, '\0', 1); + gf_log("crypt", GF_LOG_DEBUG, "Read file: %s", str); +} + +static int32_t data_conf_invariant(struct avec_config *conf) +{ + return conf->acount == + !!has_head_block(conf) + + !!has_tail_block(conf)+ + conf->nr_full_blocks; +} + +static int32_t hole_conf_invariant(struct avec_config *conf) +{ + return conf->blocks_in_pool == + !!has_head_block(conf) + + !!has_tail_block(conf)+ + !!has_full_blocks(conf); +} + +static void crypt_check_conf(struct avec_config *conf) +{ + int32_t ret = 0; + const char *msg; + + switch (conf->type) { + case DATA_ATOM: + msg = "data"; + ret = data_conf_invariant(conf); + break; + case HOLE_ATOM: + msg = "hole"; + ret = hole_conf_invariant(conf); + break; + default: + msg = "unknown"; + } + if (!ret) + gf_log("crypt", GF_LOG_DEBUG, "bad %s conf", msg); +} + +static void check_buf(call_frame_t *frame, xlator_t *this, struct iatt *buf) +{ + crypt_local_t *local = frame->local; + struct object_cipher_info *object = &local->info->cinfo; + uint64_t local_file_size; + + switch(local->fop) { + case GF_FOP_FTRUNCATE: + return; + case GF_FOP_WRITE: + local_file_size = local->new_file_size; + break; + case GF_FOP_READ: + if 
(parent_is_crypt_xlator(frame, this)) + return; + local_file_size = local->cur_file_size; + break; + default: + gf_log("crypt", GF_LOG_DEBUG, "bad file operation"); + return; + } + if (buf->ia_size != round_up(local_file_size, + object_alg_blksize(object))) + gf_log("crypt", GF_LOG_DEBUG, + "bad ia_size in buf (%llu), should be %llu", + (unsigned long long)buf->ia_size, + (unsigned long long)round_up(local_file_size, + object_alg_blksize(object))); +} + +#else +#define check_read(frame, this, op_ret, vec, count, stbuf) noop +#define dump_plain_text(local, avec) noop +#define crypt_check_conf(conf) noop +#define check_buf(frame, this, buf) noop +#endif /* DEBUG_CRYPT */ + +/* + * Pre-conditions: + * @vec represents a ciphertext of expanded size and + * aligned offset. + * + * Compose a temporary vector @avec of block-aligned + * components, decrypt it and fix it up to represent a chunk + * of data corresponding to the original size and offset. + * Pass the result to the next translator. + */ +int32_t crypt_readv_cbk(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iovec *vec, + int32_t count, + struct iatt *stbuf, + struct iobref *iobref, + dict_t *xdata) +{ + crypt_local_t *local = frame->local; + struct avec_config *conf = &local->data_conf; + struct object_cipher_info *object = &local->info->cinfo; + + struct iovec *avec; + uint32_t i; + uint32_t to_vec; + uint32_t to_user; + + check_buf(frame, this, stbuf); + check_read(frame, this, op_ret, vec, count, stbuf); + + local->op_ret = op_ret; + local->op_errno = op_errno; + local->iobref = iobref_ref(iobref); + + local->buf = *stbuf; + local->buf.ia_size = local->cur_file_size; + + if (op_ret <= 0 || count == 0 || vec[0].iov_len == 0) + goto put_one_call; + + if (conf->orig_offset >= local->cur_file_size) { + local->op_ret = 0; + goto put_one_call; + } + /* + * correct config params with real file size + * and actual amount of bytes read + */ + 
set_config_offsets(frame, this, + conf->orig_offset, op_ret, DATA_ATOM, 0); + + if (conf->orig_offset + conf->orig_size > local->cur_file_size) + conf->orig_size = local->cur_file_size - conf->orig_offset; + /* + * calculate amount of data to be returned + * to user. + */ + to_user = op_ret; + if (conf->aligned_offset + to_user <= conf->orig_offset) { + gf_log(this->name, GF_LOG_WARNING, "Incomplete read"); + local->op_ret = -1; + local->op_errno = EIO; + goto put_one_call; + } + to_user -= (conf->aligned_offset - conf->orig_offset); + + if (to_user > conf->orig_size) + to_user = conf->orig_size; + local->rw_count = to_user; + + op_errno = set_config_avec_data(this, local, + conf, object, vec, count); + if (op_errno) { + local->op_ret = -1; + local->op_errno = op_errno; + goto put_one_call; + } + avec = conf->avec; +#if DEBUG_CRYPT + if (conf->off_in_tail != 0 && + conf->off_in_tail < object_alg_blksize(object) && + object_alg_should_pad(object)) + gf_log(this->name, GF_LOG_DEBUG, "Bad offset in tail %d", + conf->off_in_tail); + if (iovec_get_size(vec, count) != 0 && + in_same_lblock(conf->orig_offset + iovec_get_size(vec, count) - 1, + local->cur_file_size - 1, + object_alg_blkbits(object))) { + gf_log(this->name, GF_LOG_DEBUG, "Compound last cblock"); + dump_cblock(this, + (unsigned char *)(avec[conf->acount - 1].iov_base) + + avec[conf->acount - 1].iov_len - object_alg_blksize(object)); + dump_cblock(this, + (unsigned char *)(vec[count - 1].iov_base) + + vec[count - 1].iov_len - object_alg_blksize(object)); + } +#endif + decrypt_aligned_iov(object, avec, + conf->acount, conf->aligned_offset); + /* + * pass proper plain data to user + */ + avec[0].iov_base += (conf->aligned_offset - conf->orig_offset); + avec[0].iov_len -= (conf->aligned_offset - conf->orig_offset); + + to_vec = to_user; + for (i = 0; i < conf->acount; i++) { + if (avec[i].iov_len > to_vec) + avec[i].iov_len = to_vec; + to_vec -= avec[i].iov_len; + } + put_one_call: + put_one_call_readv(frame, 
this); + return 0; +} + +static int32_t do_readv(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + dict_t *dict, + dict_t *xdata) +{ + data_t *data; + crypt_local_t *local = frame->local; + + if (op_ret < 0) + goto error; + /* + * extract regular file size + */ + data = dict_get(dict, FSIZE_XATTR_PREFIX); + if (!data) { + gf_log("crypt", GF_LOG_WARNING, "Regular file size not found"); + op_errno = EIO; + goto error; + } + local->cur_file_size = data_to_uint64(data); + + get_one_call(frame); + STACK_WIND(frame, + crypt_readv_cbk, + FIRST_CHILD (this), + FIRST_CHILD (this)->fops->readv, + local->fd, + /* + * FIXME: read amount can be reduced + */ + local->data_conf.expanded_size, + local->data_conf.aligned_offset, + local->flags, + local->xdata); + return 0; + error: + local->op_ret = -1; + local->op_errno = op_errno; + + get_one_call(frame); + put_one_call_readv(frame, this); + return 0; +} + +static int32_t crypt_readv_finodelk_cbk(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + dict_t *xdata) +{ + crypt_local_t *local = frame->local; + + if (op_ret < 0) + goto error; + /* + * An access has been granted, + * retrieve file size + */ + STACK_WIND(frame, + do_readv, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->fgetxattr, + local->fd, + FSIZE_XATTR_PREFIX, + NULL); + return 0; + error: + fd_unref(local->fd); + if (local->xdata) + dict_unref(local->xdata); + STACK_UNWIND_STRICT(readv, + frame, + -1, + op_errno, + NULL, + 0, + NULL, + NULL, + NULL); + return 0; +} + +static int32_t readv_trivial_completion(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iatt *buf, + dict_t *xdata) +{ + crypt_local_t *local = frame->local; + + local->op_ret = op_ret; + local->op_errno = op_errno; + + if (op_ret < 0) { + gf_log(this->name, GF_LOG_WARNING, + "stat failed (%d)", op_errno); + goto error; + } + local->buf = *buf; + 
STACK_WIND(frame, + load_file_size, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->getxattr, + local->loc, + FSIZE_XATTR_PREFIX, + NULL); + return 0; + error: + STACK_UNWIND_STRICT(readv, frame, op_ret, op_errno, + NULL, 0, NULL, NULL, NULL); + return 0; +} + +int32_t crypt_readv(call_frame_t *frame, + xlator_t *this, + fd_t *fd, + size_t size, + off_t offset, + uint32_t flags, dict_t *xdata) +{ + int32_t ret; + crypt_local_t *local; + struct crypt_inode_info *info; + struct gf_flock lock = {0, }; + +#if DEBUG_CRYPT + gf_log("crypt", GF_LOG_DEBUG, "reading %d bytes from offset %llu", + (int)size, (long long)offset); + if (parent_is_crypt_xlator(frame, this)) + gf_log("crypt", GF_LOG_DEBUG, "parent is crypt"); +#endif + local = crypt_alloc_local(frame, this, GF_FOP_READ); + if (!local) { + ret = ENOMEM; + goto error; + } + if (size == 0) + goto trivial; + + local->fd = fd_ref(fd); + local->flags = flags; + + info = local_get_inode_info(local, this); + if (info == NULL) { + ret = EINVAL; + fd_unref(fd); + goto error; + } + if (!object_alg_atomic(&info->cinfo)) { + ret = EINVAL; + fd_unref(fd); + goto error; + } + set_config_offsets(frame, this, offset, size, + DATA_ATOM, 0); + if (parent_is_crypt_xlator(frame, this)) { + data_t *data; + /* + * We are called by crypt_writev (or crypt_ftruncate) + * to perform the "read" component of the read-modify-write + * (or read-prune-write) sequence for some atom; + * + * don't ask for access: + * it has already been acquired + * + * Retrieve current file size + */ + if (!xdata) { + gf_log("crypt", GF_LOG_WARNING, + "Regular file size hasn't been passed"); + ret = EIO; + goto error; + } + data = dict_get(xdata, FSIZE_XATTR_PREFIX); + if (!data) { + gf_log("crypt", GF_LOG_WARNING, + "Regular file size not found"); + ret = EIO; + goto error; + } + local->old_file_size = + local->cur_file_size = data_to_uint64(data); + + get_one_call(frame); + STACK_WIND(frame, + crypt_readv_cbk, + FIRST_CHILD(this), + 
FIRST_CHILD(this)->fops->readv, + local->fd, + /* + * FIXME: read amount can be reduced + */ + local->data_conf.expanded_size, + local->data_conf.aligned_offset, + flags, + NULL); + return 0; + } + if (xdata) + local->xdata = dict_ref(xdata); + + lock.l_len = 0; + lock.l_start = 0; + lock.l_type = F_RDLCK; + lock.l_whence = SEEK_SET; + + STACK_WIND(frame, + crypt_readv_finodelk_cbk, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->finodelk, + this->name, + fd, + F_SETLKW, + &lock, + NULL); + return 0; + trivial: + STACK_WIND(frame, + readv_trivial_completion, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->fstat, + fd, + NULL); + return 0; + error: + STACK_UNWIND_STRICT(readv, + frame, + -1, + ret, + NULL, + 0, + NULL, + NULL, + NULL); + return 0; +} + +void set_local_io_params_writev(call_frame_t *frame, + struct object_cipher_info *object, + struct rmw_atom *atom, + off_t io_offset, + uint32_t io_size) +{ + crypt_local_t *local = frame->local; + + local->io_offset = io_offset; + local->io_size = io_size; + + local->io_offset_nopad = + atom->offset_at(frame, object) + atom->offset_in(frame, object); + + gf_log("crypt", GF_LOG_DEBUG, + "set nopad offset to %llu", + (unsigned long long)local->io_offset_nopad); + + local->io_size_nopad = atom->io_size_nopad(frame, object); + + gf_log("crypt", GF_LOG_DEBUG, + "set nopad size to %llu", + (unsigned long long)local->io_size_nopad); + + local->update_disk_file_size = 0; + /* + * NOTE: eof_padding_size is 0 for all full atoms; + * For head and tail atoms it will be set up at rmw_partial block() + */ + local->new_file_size = local->cur_file_size; + + if (local->io_offset_nopad + local->io_size_nopad > local->cur_file_size) { + + local->new_file_size = local->io_offset_nopad + local->io_size_nopad; + + gf_log("crypt", GF_LOG_DEBUG, + "set new file size to %llu", + (unsigned long long)local->new_file_size); + + local->update_disk_file_size = 1; + } +} + +void set_local_io_params_ftruncate(call_frame_t *frame, + struct 
object_cipher_info *object) +{ + uint32_t resid; + crypt_local_t *local = frame->local; + struct avec_config *conf = &local->data_conf; + + resid = conf->orig_offset & (object_alg_blksize(object) - 1); + if (resid) { + local->eof_padding_size = + object_alg_blksize(object) - resid; + local->new_file_size = conf->aligned_offset; + local->update_disk_file_size = 0; + /* + * file size will be updated + * in the ->writev() stack, + * when submitting file tail + */ + } + else { + local->eof_padding_size = 0; + local->new_file_size = conf->orig_offset; + local->update_disk_file_size = 1; + /* + * file size will be updated + * in this ->ftruncate stack + */ + } +} + +static inline void submit_head(call_frame_t *frame, xlator_t *this) +{ + crypt_local_t *local = frame->local; + submit_partial(frame, this, local->fd, HEAD_ATOM); +} + +static inline void submit_tail(call_frame_t *frame, xlator_t *this) +{ + crypt_local_t *local = frame->local; + submit_partial(frame, this, local->fd, TAIL_ATOM); +} + +static void submit_hole(call_frame_t *frame, xlator_t *this) +{ + /* + * hole conversion always means + * appended write and goes in ordered fashion + */ + do_ordered_submit(frame, this, HOLE_ATOM); +} + +static void submit_data(call_frame_t *frame, xlator_t *this) +{ + if (is_ordered_mode(frame)) { + do_ordered_submit(frame, this, DATA_ATOM); + return; + } + gf_log("crypt", GF_LOG_WARNING, "Bad submit mode"); + get_nr_calls(frame, nr_calls_data(frame)); + do_parallel_submit(frame, this, DATA_ATOM); + return; +} + +/* + * helpers called by writev_cbk, ftruncate_cbk in ordered mode + */ + +static inline int32_t should_submit_hole(crypt_local_t *local) +{ + struct avec_config *conf = &local->hole_conf; + + return conf->avec != NULL; +} + +static inline int32_t should_resume_submit_hole(crypt_local_t *local) +{ + struct avec_config *conf = &local->hole_conf; + + if (local->fop == GF_FOP_WRITE && has_tail_block(conf)) + /* + * Don't submit a part of hole, which + * fits into a data 
block: + * this part of hole will be converted + * as a gap filled by zeros in data head + * block. + */ + return conf->cursor < conf->acount - 1; + else + return conf->cursor < conf->acount; +} + +static inline int32_t should_resume_submit_data(call_frame_t *frame) +{ + crypt_local_t *local = frame->local; + struct avec_config *conf = &local->data_conf; + + if (is_ordered_mode(frame)) + return conf->cursor < conf->acount; + /* + * parallel writes + */ + return 0; +} + +static inline int32_t should_submit_data_after_hole(crypt_local_t *local) +{ + return local->data_conf.avec != NULL; +} + +static void update_local_file_params(call_frame_t *frame, + xlator_t *this, + struct iatt *prebuf, + struct iatt *postbuf) +{ + crypt_local_t *local = frame->local; + + check_buf(frame, this, postbuf); + + local->prebuf = *prebuf; + local->postbuf = *postbuf; + + local->prebuf.ia_size = local->cur_file_size; + local->postbuf.ia_size = local->new_file_size; + + local->cur_file_size = local->new_file_size; +} + +static int32_t end_writeback_writev(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iatt *prebuf, + struct iatt *postbuf, + dict_t *xdata) +{ + crypt_local_t *local = frame->local; + + local->op_ret = op_ret; + local->op_errno = op_errno; + + if (op_ret <= 0) { + gf_log(this->name, GF_LOG_WARNING, + "writev iteration failed"); + goto put_one_call; + } + /* + * op_ret includes paddings (atom's head, atom's tail and EOF) + */ + if (op_ret < local->io_size) { + gf_log(this->name, GF_LOG_WARNING, + "Incomplete writev iteration"); + goto put_one_call; + } + op_ret -= local->eof_padding_size; + local->op_ret = op_ret; + + update_local_file_params(frame, this, prebuf, postbuf); + + if (data_write_in_progress(local)) { + + LOCK(&local->rw_count_lock); + local->rw_count += op_ret; + UNLOCK(&local->rw_count_lock); + + if (should_resume_submit_data(frame)) + submit_data(frame, this); + } + else { + /* + * hole conversion is going 
on; + * don't take into account written zeros + */ + if (should_resume_submit_hole(local)) + submit_hole(frame, this); + + else if (should_submit_data_after_hole(local)) + submit_data(frame, this); + } + put_one_call: + put_one_call_writev(frame, this); + return 0; +} + +#define crypt_writev_cbk end_writeback_writev + +#define HOLE_WRITE_CHUNK_BITS 12 +#define HOLE_WRITE_CHUNK_SIZE (1 << HOLE_WRITE_CHUNK_BITS) + +/* + * Convert hole of size @size at offset @off to + * zeros and prepare respective iovecs for submit. + * The hole lock should be held. + * + * Pre-conditions: + * @local->file_size is set and valid. + */ +int32_t prepare_for_submit_hole(call_frame_t *frame, xlator_t *this, + uint64_t off, off_t size) +{ + int32_t ret; + crypt_local_t *local = frame->local; + struct object_cipher_info *object = &local->info->cinfo; + + set_config_offsets(frame, this, off, size, HOLE_ATOM, 1); + + ret = set_config_avec_hole(this, local, + &local->hole_conf, object, local->fop); + crypt_check_conf(&local->hole_conf); + + return ret; +} + +/* + * prepare for submit @count bytes at offset @from + */ +int32_t prepare_for_submit_data(call_frame_t *frame, xlator_t *this, + off_t from, int32_t size, struct iovec *vec, + int32_t vec_count, int32_t setup_gap) +{ + uint32_t ret; + crypt_local_t *local = frame->local; + struct object_cipher_info *object = &local->info->cinfo; + + set_config_offsets(frame, this, from, size, + DATA_ATOM, setup_gap); + + ret = set_config_avec_data(this, local, + &local->data_conf, object, vec, vec_count); + crypt_check_conf(&local->data_conf); + + return ret; +} + +static void free_avec(struct iovec *avec, + char **pool, int blocks_in_pool) +{ + if (!avec) + return; + GF_FREE(pool); + GF_FREE(avec); +} + +static void free_avec_data(crypt_local_t *local) +{ + return free_avec(local->data_conf.avec, + local->data_conf.pool, + local->data_conf.blocks_in_pool); +} + +static void free_avec_hole(crypt_local_t *local) +{ + return 
free_avec(local->hole_conf.avec, + local->hole_conf.pool, + local->hole_conf.blocks_in_pool); +} + + +static void do_parallel_submit(call_frame_t *frame, xlator_t *this, + atom_data_type dtype) +{ + crypt_local_t *local = frame->local; + struct avec_config *conf; + + local->active_setup = dtype; + conf = conf_by_type(frame, dtype); + + if (has_head_block(conf)) + submit_head(frame, this); + + if (has_full_blocks(conf)) + submit_full(frame, this); + + if (has_tail_block(conf)) + submit_tail(frame, this); + return; +} + +static void do_ordered_submit(call_frame_t *frame, xlator_t *this, + atom_data_type dtype) +{ + crypt_local_t *local = frame->local; + struct avec_config *conf; + + local->active_setup = dtype; + conf = conf_by_type(frame, dtype); + + if (should_submit_head_block(conf)) { + get_one_call_nolock(frame); + submit_head(frame, this); + } + else if (should_submit_full_block(conf)) { + get_one_call_nolock(frame); + submit_full(frame, this); + } + else if (should_submit_tail_block(conf)) { + get_one_call_nolock(frame); + submit_tail(frame, this); + } + else + gf_log("crypt", GF_LOG_DEBUG, + "nothing has been submitted in ordered mode"); + return; +} + +static int32_t do_writev(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + dict_t *dict, + dict_t *xdata) +{ + data_t *data; + crypt_local_t *local = frame->local; + struct object_cipher_info *object = &local->info->cinfo; + /* + * extract regular file size + */ + data = dict_get(dict, FSIZE_XATTR_PREFIX); + if (!data) { + gf_log("crypt", GF_LOG_WARNING, "Regular file size not found"); + op_ret = -1; + op_errno = EIO; + goto error; + } + local->old_file_size = local->cur_file_size = data_to_uint64(data); + + set_gap_at_end(frame, object, &local->data_conf, DATA_ATOM); + + if (local->cur_file_size < local->data_conf.orig_offset) { + /* + * Set up hole config + */ + op_errno = prepare_for_submit_hole(frame, + this, + local->cur_file_size, + 
local->data_conf.orig_offset - local->cur_file_size); + if (op_errno) { + local->op_ret = -1; + local->op_errno = op_errno; + goto error; + } + } + if (should_submit_hole(local)) + submit_hole(frame, this); + else + submit_data(frame, this); + return 0; + error: + get_one_call_nolock(frame); + put_one_call_writev(frame, this); + return 0; +} + +static int32_t crypt_writev_finodelk_cbk(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + dict_t *xdata) +{ + crypt_local_t *local = frame->local; + + local->op_ret = op_ret; + local->op_errno = op_errno; + + if (op_ret < 0) + goto error; + /* + * An access has been granted, + * retrieve file size first + */ + STACK_WIND(frame, + do_writev, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->fgetxattr, + local->fd, + FSIZE_XATTR_PREFIX, + NULL); + return 0; + error: + get_one_call_nolock(frame); + put_one_call_writev(frame, this); + return 0; +} + +static int32_t writev_trivial_completion(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iatt *buf, + dict_t *dict) +{ + crypt_local_t *local = frame->local; + + local->op_ret = op_ret; + local->op_errno = op_errno; + local->prebuf = *buf; + local->postbuf = *buf; + + local->prebuf.ia_size = local->cur_file_size; + local->postbuf.ia_size = local->cur_file_size; + + get_one_call(frame); + put_one_call_writev(frame, this); + return 0; +} + +int crypt_writev(call_frame_t *frame, + xlator_t *this, + fd_t *fd, + struct iovec *vec, + int32_t count, + off_t offset, + uint32_t flags, + struct iobref *iobref, + dict_t *xdata) +{ + int32_t ret; + crypt_local_t *local; + struct crypt_inode_info *info; + struct gf_flock lock = {0, }; +#if DEBUG_CRYPT + gf_log ("crypt", GF_LOG_DEBUG, "writing %d bytes from offset %llu", + (int)iovec_get_size(vec, count), (long long)offset); +#endif + local = crypt_alloc_local(frame, this, GF_FOP_WRITE); + if (!local) { + ret = ENOMEM; + goto error; + } + local->fd = 
fd_ref(fd); + + if (iobref) + local->iobref = iobref_ref(iobref); + /* + * to update real file size on the server + */ + local->xattr = dict_new(); + if (!local->xattr) { + ret = ENOMEM; + goto error; + } + local->flags = flags; + + info = local_get_inode_info(local, this); + if (info == NULL) { + ret = EINVAL; + goto error; + } + if (!object_alg_atomic(&info->cinfo)) { + ret = EINVAL; + goto error; + } + if (iovec_get_size(vec, count) == 0) + goto trivial; + + ret = prepare_for_submit_data(frame, this, offset, + iovec_get_size(vec, count), + vec, count, 0 /* don't setup gap + in tail: we don't + know file size yet */); + if (ret) + goto error; + + if (parent_is_crypt_xlator(frame, this)) { + data_t *data; + /* + * we are called by shrinking crypt_ftruncate(), + * which doesn't perform hole conversion; + * + * don't ask for access: + * it has already been acquired + */ + + /* + * extract file size + */ + if (!xdata) { + gf_log("crypt", GF_LOG_WARNING, + "Regular file size hasn't been passed"); + ret = EIO; + goto error; + } + data = dict_get(xdata, FSIZE_XATTR_PREFIX); + if (!data) { + gf_log("crypt", GF_LOG_WARNING, + "Regular file size not found"); + ret = EIO; + goto error; + } + local->old_file_size = + local->cur_file_size = data_to_uint64(data); + + submit_data(frame, this); + return 0; + } + if (xdata) + local->xdata = dict_ref(xdata); + /* + * lock the file and retrieve its size + */ + lock.l_len = 0; + lock.l_start = 0; + lock.l_type = F_WRLCK; + lock.l_whence = SEEK_SET; + + STACK_WIND(frame, + crypt_writev_finodelk_cbk, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->finodelk, + this->name, + fd, + F_SETLKW, + &lock, + NULL); + return 0; + trivial: + STACK_WIND(frame, + writev_trivial_completion, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->fstat, + fd, + NULL); + return 0; + error: + if (local && local->fd) + fd_unref(fd); + if (local && local->iobref) + iobref_unref(iobref); + if (local && local->xdata) + dict_unref(xdata); + if (local && 
local->xattr) + dict_unref(local->xattr); + if (local && local->info) + free_inode_info(local->info); + + STACK_UNWIND_STRICT(writev, frame, -1, ret, NULL, NULL, NULL); + return 0; +} + +int32_t prepare_for_prune(call_frame_t *frame, xlator_t *this, uint64_t offset) +{ + set_config_offsets(frame, this, + offset, + 0, /* count */ + DATA_ATOM, + 0 /* since we prune, there is no + gap in tail to uptodate */); + return 0; +} + +/* + * Finish the read-prune-write sequence + * + * Can be invoked as + * 1) ->ftruncate_cbk() for cblock-aligned, or trivial prune + * 2) ->writev_cbk() for non-cblock-aligned prune + */ + +static int32_t prune_complete(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iatt *prebuf, + struct iatt *postbuf, + dict_t *xdata) +{ + crypt_local_t *local = frame->local; + + local->op_ret = op_ret; + local->op_errno = op_errno; + + update_local_file_params(frame, this, prebuf, postbuf); + + put_one_call_ftruncate(frame, this); + return 0; +} + +/* + * This is called as ->ftruncate_cbk() + * + * Perform the "write" component of the + * read-prune-write sequence. + * + * submit the rest of the file + */ +static int32_t prune_submit_file_tail(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iatt *prebuf, + struct iatt *postbuf, + dict_t *xdata) +{ + crypt_local_t *local = frame->local; + struct avec_config *conf = &local->data_conf; + dict_t *dict; + + if (op_ret < 0) + goto put_one_call; + + if (local->xdata) { + dict_unref(local->xdata); + local->xdata = NULL; + } + if (xdata) + local->xdata = dict_ref(xdata); + + dict = dict_new(); + if (!dict) { + op_errno = ENOMEM; + goto error; + } + + update_local_file_params(frame, this, prebuf, postbuf); + local->new_file_size = conf->orig_offset; + + /* + * The rest of the file is a partial block and, hence, + * should be written via RMW sequence, so the crypt xlator + * does STACK_WIND to itself. 
+ * + * Pass current file size to crypt_writev() + */ + op_errno = dict_set(dict, + FSIZE_XATTR_PREFIX, + data_from_uint64(local->cur_file_size)); + if (op_errno) { + gf_log("crypt", GF_LOG_WARNING, + "can not set key to update file size"); + dict_unref(dict); + goto error; + } + gf_log("crypt", GF_LOG_DEBUG, + "passing current file size (%llu) to crypt_writev", + (unsigned long long)local->cur_file_size); + /* + * Padding will be filled with + * zeros by rmw_partial_block() + */ + STACK_WIND(frame, + prune_complete, + this, + this->fops->writev, /* crypt_writev */ + local->fd, + &local->vec, + 1, + conf->aligned_offset, /* offset to write from */ + 0, + local->iobref, + dict); + + dict_unref(dict); + return 0; + error: + local->op_ret = -1; + local->op_errno = op_errno; + put_one_call: + put_one_call_ftruncate(frame, this); + return 0; +} + +/* + * This is called as a callback of ->writev() invoked on behalf + * of ftruncate(): it can be + * 1) ordered writes issued by hole conversion in the case of + * expanded truncate, or + * 2) an rmw partial data block issued by non-cblock-aligned + * prune. + */ +int32_t end_writeback_ftruncate(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iatt *prebuf, + struct iatt *postbuf, + dict_t *xdata) +{ + crypt_local_t *local = frame->local; + /* + * if nothing has been written, + * then it must be an error + */ + local->op_ret = op_ret; + local->op_errno = op_errno; + + if (op_ret < 0) + goto put_one_call; + + update_local_file_params(frame, this, prebuf, postbuf); + + if (data_write_in_progress(local)) + /* case (2) */ + goto put_one_call; + /* case (1) */ + if (should_resume_submit_hole(local)) + submit_hole(frame, this); + /* + * case of hole, when we shouldn't resume + */ + put_one_call: + put_one_call_ftruncate(frame, this); + return 0; +} + +/* + * Perform prune and write components of the + * read-prune-write sequence. 
+ * + * Called as ->readv_cbk() + * + * Pre-conditions: + * @vec contains the latest atom of the file + * (plain text) + */ +static int32_t prune_write(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iovec *vec, + int32_t count, + struct iatt *stbuf, + struct iobref *iobref, + dict_t *xdata) +{ + int32_t i; + size_t to_copy; + size_t copied = 0; + crypt_local_t *local = frame->local; + struct avec_config *conf = &local->data_conf; + + local->op_ret = op_ret; + local->op_errno = op_errno; + if (op_ret == -1) + goto put_one_call; + + /* + * At first, uptodate head block + */ + if (iovec_get_size(vec, count) < conf->off_in_head) { + gf_log(this->name, GF_LOG_WARNING, + "Failed to uptodate head block for prune"); + local->op_ret = -1; + local->op_errno = EIO; + goto put_one_call; + } + local->vec.iov_len = conf->off_in_head; + local->vec.iov_base = GF_CALLOC(1, local->vec.iov_len, + gf_crypt_mt_data); + + if (local->vec.iov_base == NULL) { + local->op_ret = -1; + local->op_errno = ENOMEM; + goto put_one_call; + } + for (i = 0; i < count; i++) { + to_copy = vec[i].iov_len; + if (to_copy > local->vec.iov_len - copied) + to_copy = local->vec.iov_len - copied; + + memcpy((char *)local->vec.iov_base + copied, + vec[i].iov_base, + to_copy); + copied += to_copy; + if (copied == local->vec.iov_len) + break; + } + /* + * perform prune with aligned offset + * (i.e. 
at this step we prune a bit + * more than is needed) + */ + STACK_WIND(frame, + prune_submit_file_tail, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->ftruncate, + local->fd, + conf->aligned_offset, + local->xdata); + return 0; + put_one_call: + put_one_call_ftruncate(frame, this); + return 0; +} + +/* + * Perform a read-prune-write sequence + */ +int32_t read_prune_write(call_frame_t *frame, xlator_t *this) +{ + int32_t ret = 0; + dict_t *dict = NULL; + crypt_local_t *local = frame->local; + struct avec_config *conf = &local->data_conf; + struct object_cipher_info *object = &local->info->cinfo; + + set_local_io_params_ftruncate(frame, object); + get_one_call_nolock(frame); + + if ((conf->orig_offset & (object_alg_blksize(object) - 1)) == 0) { + /* + * cblock-aligned prune: + * we don't need read and write components, + * just cut file body + */ + gf_log("crypt", GF_LOG_DEBUG, + "prune without RMW (at offset %llu)", + (unsigned long long)conf->orig_offset); + + STACK_WIND(frame, + prune_complete, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->ftruncate, + local->fd, + conf->orig_offset, + local->xdata); + return 0; + } + gf_log("crypt", GF_LOG_DEBUG, + "prune with RMW (at offset %llu)", + (unsigned long long)conf->orig_offset); + /* + * We are about to perform the "read" component of the + * read-prune-write sequence. It means that we need to + * read encrypted data from disk and decrypt it. + * So, the crypt translator does STACK_WIND to itself. 
+ * + * Pass current file size to crypt_readv() + + */ + dict = dict_new(); + if (!dict) { + gf_log("crypt", GF_LOG_WARNING, "Can not alloc dict"); + ret = ENOMEM; + goto exit; + } + ret = dict_set(dict, + FSIZE_XATTR_PREFIX, + data_from_uint64(local->cur_file_size)); + if (ret) { + gf_log("crypt", GF_LOG_WARNING, "Can not set dict"); + goto exit; + } + STACK_WIND(frame, + prune_write, + this, + this->fops->readv, /* crypt_readv */ + local->fd, + get_atom_size(object), /* bytes to read */ + conf->aligned_offset, /* offset to read from */ + 0, + dict); + exit: + if (dict) + dict_unref(dict); + return ret; +} + +/* + * File prune is more complicated than expand. + * First we need to read the latest atom to not lose info + * needed for proper update. Also we need to make sure that + * every component of read-prune-write sequence leaves data + * consistent + * + * Non-cblock aligned prune is performed as read-prune-write + * sequence: + * + * 1) read the latest atom; + * 2) perform cblock-aligned prune + * 3) issue a write request for the end-of-file + */ +int32_t prune_file(call_frame_t *frame, xlator_t *this, uint64_t offset) +{ + int32_t ret; + + ret = prepare_for_prune(frame, this, offset); + if (ret) + return ret; + return read_prune_write(frame, this); +} + +int32_t expand_file(call_frame_t *frame, xlator_t *this, + uint64_t offset) +{ + int32_t ret; + crypt_local_t *local = frame->local; + + ret = prepare_for_submit_hole(frame, this, + local->old_file_size, + offset - local->old_file_size); + if (ret) + return ret; + submit_hole(frame, this); + return 0; +} + +static int32_t ftruncate_trivial_completion(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iatt *buf, + dict_t *dict) +{ + crypt_local_t *local = frame->local; + + local->op_ret = op_ret; + local->op_errno = op_errno; + local->prebuf = *buf; + local->postbuf = *buf; + + local->prebuf.ia_size = local->cur_file_size; + local->postbuf.ia_size = 
local->cur_file_size; + + get_one_call(frame); + put_one_call_ftruncate(frame, this); + return 0; +} + +static int32_t do_ftruncate(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + dict_t *dict, + dict_t *xdata) +{ + data_t *data; + crypt_local_t *local = frame->local; + + if (op_ret) + goto error; + /* + * extract regular file size + */ + data = dict_get(dict, FSIZE_XATTR_PREFIX); + if (!data) { + gf_log("crypt", GF_LOG_WARNING, "Regular file size not found"); + op_errno = EIO; + goto error; + } + local->old_file_size = local->cur_file_size = data_to_uint64(data); + + if (local->data_conf.orig_offset == local->cur_file_size) { +#if DEBUG_CRYPT + gf_log("crypt", GF_LOG_DEBUG, + "trivial ftruncate (current file size %llu)", + (unsigned long long)local->cur_file_size); +#endif + goto trivial; + } + else if (local->data_conf.orig_offset < local->cur_file_size) { +#if DEBUG_CRYPT + gf_log("crypt", GF_LOG_DEBUG, "prune from %llu to %llu", + (unsigned long long)local->cur_file_size, + (unsigned long long)local->data_conf.orig_offset); +#endif + op_errno = prune_file(frame, + this, + local->data_conf.orig_offset); + } + else { +#if DEBUG_CRYPT + gf_log("crypt", GF_LOG_DEBUG, "expand from %llu to %llu", + (unsigned long long)local->cur_file_size, + (unsigned long long)local->data_conf.orig_offset); +#endif + op_errno = expand_file(frame, + this, + local->data_conf.orig_offset); + } + if (op_errno) + goto error; + return 0; + trivial: + STACK_WIND(frame, + ftruncate_trivial_completion, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->fstat, + local->fd, + NULL); + return 0; + error: + /* + * finish with ftruncate + */ + local->op_ret = -1; + local->op_errno = op_errno; + + get_one_call_nolock(frame); + put_one_call_ftruncate(frame, this); + return 0; +} + +static int32_t crypt_ftruncate_finodelk_cbk(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + dict_t *xdata) +{ + crypt_local_t 
*local = frame->local; + + local->op_ret = op_ret; + local->op_errno = op_errno; + + if (op_ret < 0) + goto error; + /* + * An access has been granted, + * retrieve file size first + */ + STACK_WIND(frame, + do_ftruncate, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->fgetxattr, + local->fd, + FSIZE_XATTR_PREFIX, + NULL); + return 0; + error: + get_one_call_nolock(frame); + put_one_call_ftruncate(frame, this); + return 0; +} + +/* + * ftruncate is performed in 2 steps: + * . receive file size; + * . expand or prune file. + */ +static int32_t crypt_ftruncate(call_frame_t *frame, + xlator_t *this, + fd_t *fd, + off_t offset, + dict_t *xdata) +{ + int32_t ret; + crypt_local_t *local; + struct crypt_inode_info *info; + struct gf_flock lock = {0, }; + + local = crypt_alloc_local(frame, this, GF_FOP_FTRUNCATE); + if (!local) { + ret = ENOMEM; + goto error; + } + local->xattr = dict_new(); + if (!local->xattr) { + ret = ENOMEM; + goto error; + } + local->fd = fd_ref(fd); + info = local_get_inode_info(local, this); + if (info == NULL) { + ret = EINVAL; + goto error; + } + if (!object_alg_atomic(&info->cinfo)) { + ret = EINVAL; + goto error; + } + local->data_conf.orig_offset = offset; + if (xdata) + local->xdata = dict_ref(xdata); + + lock.l_len = 0; + lock.l_start = 0; + lock.l_type = F_WRLCK; + lock.l_whence = SEEK_SET; + + STACK_WIND(frame, + crypt_ftruncate_finodelk_cbk, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->finodelk, + this->name, + fd, + F_SETLKW, + &lock, + NULL); + return 0; + error: + if (local && local->fd) + fd_unref(fd); + if (local && local->xdata) + dict_unref(xdata); + if (local && local->xattr) + dict_unref(local->xattr); + if (local && local->info) + free_inode_info(local->info); + + STACK_UNWIND_STRICT(ftruncate, frame, -1, ret, NULL, NULL, NULL); + return 0; +} + +/* ->flush_cbk() */ +int32_t truncate_end(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + dict_t *xdata) +{ + crypt_local_t *local = 
frame->local; + + STACK_UNWIND_STRICT(truncate, + frame, + op_ret, + op_errno, + &local->prebuf, + &local->postbuf, + local->xdata); + return 0; +} + +/* ftruncate_cbk() */ +int32_t truncate_flush(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iatt *prebuf, + struct iatt *postbuf, + dict_t *xdata) +{ + crypt_local_t *local = frame->local; + fd_t *fd = local->fd; + local->prebuf = *prebuf; + local->postbuf = *postbuf; + + STACK_WIND(frame, + truncate_end, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->flush, + fd, + NULL); + fd_unref(fd); + return 0; +} + +/* + * is called as ->open_cbk() + */ +static int32_t truncate_begin(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + fd_t *fd, + dict_t *xdata) +{ + crypt_local_t *local = frame->local; + + if (op_ret < 0) { + fd_unref(fd); + STACK_UNWIND_STRICT(truncate, + frame, + op_ret, + op_errno, NULL, NULL, NULL); + return 0; + } + /* + * crypt_truncate() is implemented via crypt_ftruncate(), + * so the crypt xlator does STACK_WIND to itself here + */ + STACK_WIND(frame, + truncate_flush, + this, + this->fops->ftruncate, /* crypt_ftruncate */ + fd, + local->offset, + NULL); + return 0; +} + +/* + * crypt_truncate() is implemented via crypt_ftruncate() as a + * sequence crypt_open() - crypt_ftruncate() - truncate_flush() + */ +int32_t crypt_truncate(call_frame_t *frame, + xlator_t *this, + loc_t *loc, + off_t offset, + dict_t *xdata) +{ + fd_t *fd; + crypt_local_t *local; + +#if DEBUG_CRYPT + gf_log(this->name, GF_LOG_DEBUG, + "truncate file %s at offset %llu", + loc->path, (unsigned long long)offset); +#endif + local = crypt_alloc_local(frame, this, GF_FOP_TRUNCATE); + if (!local) + goto error; + + fd = fd_create(loc->inode, frame->root->pid); + if (!fd) { + gf_log(this->name, GF_LOG_ERROR, "Can not create fd"); + goto error; + } + local->fd = fd; + local->offset = offset; + local->xdata = xdata; + STACK_WIND(frame, + 
truncate_begin, + this, + this->fops->open, /* crypt_open() */ + loc, + O_RDWR, + fd, + NULL); + return 0; + error: + STACK_UNWIND_STRICT(truncate, frame, -1, EINVAL, NULL, NULL, NULL); + return 0; +} + +end_writeback_handler_t dispatch_end_writeback(glusterfs_fop_t fop) +{ + switch (fop) { + case GF_FOP_WRITE: + return end_writeback_writev; + case GF_FOP_FTRUNCATE: + return end_writeback_ftruncate; + default: + gf_log("crypt", GF_LOG_WARNING, "Bad wb operation %d", fop); + return NULL; + } +} + +/* + * true, if the caller needs metadata string + */ +static int32_t is_custom_mtd(dict_t *xdata) +{ + data_t *data; + uint32_t flags; + + if (!xdata) + return 0; + + data = dict_get(xdata, MSGFLAGS_PREFIX); + if (!data) + return 0; + if (data->len != sizeof(uint32_t)) { + gf_log("crypt", GF_LOG_WARNING, + "Bad msgflags size (%d)", data->len); + return -1; + } + flags = *((uint32_t *)data->data); + return msgflags_check_mtd_lock(&flags); +} + +static int32_t crypt_open_done(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, dict_t *xdata) +{ + crypt_local_t *local = frame->local; + + local->op_ret = op_ret; + local->op_errno = op_errno; + if (op_ret < 0) + gf_log(this->name, GF_LOG_WARNING, "mtd unlock failed (%d)", + op_errno); + put_one_call_open(frame); + return 0; +} + +static void crypt_open_tail(call_frame_t *frame, xlator_t *this) +{ + struct gf_flock lock = {0, }; + crypt_local_t *local = frame->local; + + lock.l_type = F_UNLCK; + lock.l_whence = SEEK_SET; + lock.l_start = 0; + lock.l_len = 0; + lock.l_pid = 0; + + STACK_WIND(frame, + crypt_open_done, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->finodelk, + this->name, + local->fd, + F_SETLKW, + &lock, + NULL); +} + +/* + * load private inode info at open time + * called as ->fgetxattr_cbk() + */ +static int load_mtd_open(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + dict_t *dict, + dict_t *xdata) +{ + int32_t ret; + 
gf_boolean_t upload_info; + data_t *mtd; + uint64_t value = 0; + struct crypt_inode_info *info; + crypt_local_t *local = frame->local; + crypt_private_t *priv = this->private; + + local->op_ret = op_ret; + local->op_errno = op_errno; + + if (local->fd->inode->ia_type == IA_IFLNK) + goto exit; + if (op_ret < 0) + goto exit; + /* + * first, check for cached info + */ + ret = inode_ctx_get(local->fd->inode, this, &value); + if (ret != -1) { + info = (struct crypt_inode_info *)(long)value; + if (info == NULL) { + gf_log(this->name, GF_LOG_WARNING, + "Inode info expected, but not found"); + local->op_ret = -1; + local->op_errno = EIO; + goto exit; + } + /* + * info has been found in the cache + */ + upload_info = _gf_false; + } + else { + /* + * info hasn't been found in the cache. + */ + info = alloc_inode_info(local, local->loc); + if (!info) { + local->op_ret = -1; + local->op_errno = ENOMEM; + goto exit; + } + init_inode_info_head(info, local->fd); + upload_info = _gf_true; + } + /* + * extract metadata + */ + mtd = dict_get(dict, CRYPTO_FORMAT_PREFIX); + if (!mtd) { + local->op_ret = -1; + local->op_errno = ENOENT; + gf_log (this->name, GF_LOG_WARNING, + "Format string wasn't found"); + goto exit; + } + /* + * authenticate metadata against the path + */ + ret = open_format((unsigned char *)mtd->data, + mtd->len, + local->loc, + info, + get_master_cinfo(priv), + local, + upload_info); + if (ret) { + local->op_ret = -1; + local->op_errno = ret; + goto exit; + } + if (upload_info) { + ret = init_inode_info_tail(info, get_master_cinfo(priv)); + if (ret) { + local->op_ret = -1; + local->op_errno = ret; + goto exit; + } + ret = inode_ctx_put(local->fd->inode, + this, (uint64_t)(long)info); + if (ret == -1) { + local->op_ret = -1; + local->op_errno = EIO; + goto exit; + } + } + if (local->custom_mtd) { + /* + * pass the metadata string to the customer + */ + ret = dict_set_static_bin(local->xdata, + CRYPTO_FORMAT_PREFIX, + mtd->data, + mtd->len); + if (ret) { + 
local->op_ret = -1; + local->op_errno = ret; + goto exit; + } + } + exit: + if (!local->custom_mtd) + crypt_open_tail(frame, this); + else + put_one_call_open(frame); + return 0; +} + +static int32_t crypt_open_finodelk_cbk(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + dict_t *xdata) +{ + crypt_local_t *local = frame->local; + + local->op_ret = op_ret; + local->op_errno = op_errno; + + if (op_ret < 0) { + gf_log(this->name, GF_LOG_WARNING, "finodelk (LOCK) failed"); + goto exit; + } + STACK_WIND(frame, + load_mtd_open, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->fgetxattr, + local->fd, + CRYPTO_FORMAT_PREFIX, + NULL); + return 0; + exit: + put_one_call_open(frame); + return 0; +} + +/* + * verify metadata against the specified pathname + */ +static int32_t crypt_open_cbk(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + fd_t *fd, + dict_t *xdata) +{ + struct gf_flock lock = {0, }; + crypt_local_t *local = frame->local; + + local->op_ret = op_ret; + local->op_errno = op_errno; + + if (local->fd->inode->ia_type == IA_IFLNK) + goto exit; + if (op_ret < 0) + goto exit; + if (xdata) + local->xdata = dict_ref(xdata); + else if (local->custom_mtd){ + local->xdata = dict_new(); + if (!local->xdata) { + local->op_ret = -1; + local->op_errno = ENOMEM; + gf_log ("crypt", GF_LOG_ERROR, + "Can not get new dict for mtd string"); + goto exit; + } + } + lock.l_len = 0; + lock.l_start = 0; + lock.l_type = local->custom_mtd ? 
F_WRLCK : F_RDLCK; + lock.l_whence = SEEK_SET; + + STACK_WIND(frame, + crypt_open_finodelk_cbk, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->finodelk, + this->name, + fd, + F_SETLKW, + &lock, + NULL); + return 0; + exit: + put_one_call_open(frame); + return 0; +} + +static int32_t crypt_open(call_frame_t *frame, + xlator_t *this, + loc_t *loc, + int32_t flags, + fd_t *fd, + dict_t *xdata) +{ + int32_t ret = ENOMEM; + crypt_local_t *local; + + local = crypt_alloc_local(frame, this, GF_FOP_OPEN); + if (!local) + goto error; + local->loc = GF_CALLOC(1, sizeof(*loc), gf_crypt_mt_loc); + if (!local->loc) { + ret = ENOMEM; + goto error; + } + memset(local->loc, 0, sizeof(*local->loc)); + ret = loc_copy(local->loc, loc); + if (ret) { + GF_FREE(local->loc); + goto error; + } + local->fd = fd_ref(fd); + + ret = is_custom_mtd(xdata); + if (ret < 0) { + loc_wipe(local->loc); + GF_FREE(local->loc); + ret = EINVAL; + goto error; + } + local->custom_mtd = ret; + + if ((flags & O_ACCMODE) == O_WRONLY) + /* + * we can't open O_WRONLY, because + * we need to do read-modify-write + */ + flags = (flags & ~O_ACCMODE) | O_RDWR; + /* + * Make sure that our translated offsets + * and counts won't be ignored + */ + flags &= ~O_APPEND; + get_one_call_nolock(frame); + STACK_WIND(frame, + crypt_open_cbk, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->open, + loc, + flags, + fd, + xdata); + return 0; + error: + STACK_UNWIND_STRICT(open, + frame, + -1, + ret, + NULL, + NULL); + return 0; +} + +static int32_t init_inode_info_tail(struct crypt_inode_info *info, + struct master_cipher_info *master) +{ + int32_t ret; + struct object_cipher_info *object = &info->cinfo; + +#if DEBUG_CRYPT + gf_log("crypt", GF_LOG_DEBUG, "Init inode info for object %s", + uuid_utoa(info->oid)); +#endif + ret = data_cipher_algs[object->o_alg][object->o_mode].set_private(info, + master); + if (ret) { + gf_log("crypt", GF_LOG_ERROR, "Set private info failed"); + return ret; + } + return 0; +} + +/* + * Init inode 
info at ->create() time + */ +static void init_inode_info_create(struct crypt_inode_info *info, + struct master_cipher_info *master, + data_t *data) +{ + struct object_cipher_info *object; + + info->nr_minor = CRYPT_XLATOR_ID; + memcpy(info->oid, data->data, data->len); + + object = &info->cinfo; + + object->o_alg = master->m_alg; + object->o_mode = master->m_mode; + object->o_block_bits = master->m_block_bits; + object->o_dkey_size = master->m_dkey_size; +} + +static void init_inode_info_head(struct crypt_inode_info *info, fd_t *fd) +{ + memcpy(info->oid, fd->inode->gfid, sizeof(uuid_t)); +} + +static int32_t crypt_create_done(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, dict_t *xdata) +{ + crypt_private_t *priv = this->private; + crypt_local_t *local = frame->local; + struct crypt_inode_info *info = local->info; + fd_t *local_fd = local->fd; + dict_t *local_xdata = local->xdata; + inode_t *local_inode = local->inode; + + if (op_ret < 0) { + free_inode_info(info); + goto unwind; + } + op_errno = init_inode_info_tail(info, get_master_cinfo(priv)); + if (op_errno) { + op_ret = -1; + free_inode_info(info); + goto unwind; + } + /* + * FIXME: drop major subversion number + */ + op_ret = inode_ctx_put(local->fd->inode, this, (uint64_t)(long)info); + if (op_ret == -1) { + op_errno = EIO; + free_inode_info(info); + goto unwind; + } + unwind: + free_format(local); + STACK_UNWIND_STRICT(create, + frame, + op_ret, + op_errno, + local_fd, + local_inode, + &local->buf, + &local->prebuf, + &local->postbuf, + local_xdata); + fd_unref(local_fd); + inode_unref(local_inode); + if (local_xdata) + dict_unref(local_xdata); + return 0; +} + +static int crypt_create_tail(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + dict_t *xdata) +{ + struct gf_flock lock = {0, }; + crypt_local_t *local = frame->local; + fd_t *local_fd = local->fd; + dict_t *local_xdata = local->xdata; + inode_t 
*local_inode = local->inode; + + dict_unref(local->xattr); + + if (op_ret < 0) + goto error; + + lock.l_type = F_UNLCK; + lock.l_whence = SEEK_SET; + lock.l_start = 0; + lock.l_len = 0; + lock.l_pid = 0; + + STACK_WIND(frame, + crypt_create_done, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->finodelk, + this->name, + local->fd, + F_SETLKW, + &lock, + NULL); + return 0; + error: + free_inode_info(local->info); + free_format(local); + + STACK_UNWIND_STRICT(create, + frame, + op_ret, + op_errno, + local_fd, + local_inode, + &local->buf, + &local->prebuf, + &local->postbuf, + local_xdata); + + fd_unref(local_fd); + inode_unref(local_inode); + if (local_xdata) + dict_unref(local_xdata); + return 0; +} + +static int32_t crypt_create_finodelk_cbk(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + dict_t *xdata) +{ + crypt_local_t *local = frame->local; + struct crypt_inode_info *info = local->info; + + if (op_ret < 0) + goto error; + + STACK_WIND(frame, + crypt_create_tail, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->fsetxattr, + local->fd, + local->xattr, /* CRYPTO_FORMAT_PREFIX */ + 0, + NULL); + return 0; + error: + free_inode_info(info); + free_format(local); + fd_unref(local->fd); + dict_unref(local->xattr); + if (local->xdata) + dict_unref(local->xdata); + + STACK_UNWIND_STRICT(create, + frame, + op_ret, + op_errno, + NULL, + NULL, + NULL, + NULL, + NULL, + NULL); + return 0; +} + +/* + * Create and store crypt-specific format on disk; + * Populate cache with private inode info + */ +static int32_t crypt_create_cbk(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + fd_t *fd, + inode_t *inode, + struct iatt *buf, + struct iatt *preparent, + struct iatt *postparent, + dict_t *xdata) +{ + struct gf_flock lock = {0, }; + crypt_local_t *local = frame->local; + struct crypt_inode_info *info = local->info; + + if (op_ret < 0) + goto error; + if (xdata) + local->xdata = 
dict_ref(xdata); + local->inode = inode_ref(inode); + local->buf = *buf; + local->prebuf = *preparent; + local->postbuf = *postparent; + + lock.l_len = 0; + lock.l_start = 0; + lock.l_type = F_WRLCK; + lock.l_whence = SEEK_SET; + + STACK_WIND(frame, + crypt_create_finodelk_cbk, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->finodelk, + this->name, + local->fd, + F_SETLKW, + &lock, + NULL); + return 0; + error: + free_inode_info(info); + free_format(local); + fd_unref(local->fd); + dict_unref(local->xattr); + + STACK_UNWIND_STRICT(create, + frame, + op_ret, + op_errno, + NULL, NULL, NULL, + NULL, NULL, NULL); + return 0; +} + +static int32_t crypt_create(call_frame_t *frame, + xlator_t *this, + loc_t *loc, + int32_t flags, + mode_t mode, + mode_t umask, + fd_t *fd, + dict_t *xdata) +{ + int ret; + data_t *data; + crypt_local_t *local; + crypt_private_t *priv; + struct master_cipher_info *master; + struct crypt_inode_info *info; + + priv = this->private; + master = get_master_cinfo(priv); + + if (master_alg_atomic(master)) { + /* + * We can't open O_WRONLY, because we + * need to do read-modify-write. 
+ */ + if ((flags & O_ACCMODE) == O_WRONLY) + flags = (flags & ~O_ACCMODE) | O_RDWR; + /* + * Make sure that our translated offsets + * and counts won't be ignored + */ + flags &= ~O_APPEND; + } + local = crypt_alloc_local(frame, this, GF_FOP_CREATE); + if (!local) { + ret = ENOMEM; + goto error; + } + data = dict_get(xdata, "gfid-req"); + if (!data) { + ret = EINVAL; + gf_log("crypt", GF_LOG_WARNING, "gfid not found"); + goto error; + } + if (data->len != sizeof(uuid_t)) { + ret = EINVAL; + gf_log("crypt", GF_LOG_WARNING, + "bad gfid size (%d), should be %d", + (int)data->len, (int)sizeof(uuid_t)); + goto error; + } + info = alloc_inode_info(local, loc); + if (!info){ + ret = ENOMEM; + goto error; + } + /* + * NOTE: + * format has to be created BEFORE + * proceeding to the untrusted server + */ + ret = alloc_format_create(local); + if (ret) { + free_inode_info(info); + goto error; + } + init_inode_info_create(info, master, data); + + ret = create_format(local->format, + loc, + info, + master); + if (ret) { + free_inode_info(info); + goto error; + } + local->xattr = dict_new(); + if (!local->xattr) { + ret = ENOMEM; + free_inode_info(info); + free_format(local); + goto error; + } + ret = dict_set_static_bin(local->xattr, + CRYPTO_FORMAT_PREFIX, + local->format, + new_format_size()); + if (ret) { + dict_unref(local->xattr); + free_inode_info(info); + free_format(local); + goto error; + } + ret = dict_set(local->xattr, FSIZE_XATTR_PREFIX, data_from_uint64(0)); + if (ret) { + dict_unref(local->xattr); + free_inode_info(info); + free_format(local); + goto error; + } + local->fd = fd_ref(fd); + + STACK_WIND(frame, + crypt_create_cbk, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->create, + loc, + flags, + mode, + umask, + fd, + xdata); + return 0; + error: + gf_log("crypt", GF_LOG_WARNING, "can not create file"); + STACK_UNWIND_STRICT(create, + frame, + -1, + ret, + NULL, NULL, NULL, + NULL, NULL, NULL); + return 0; +} + +/* + * FIXME: this should depend on the version of format 
string + */ +static int32_t filter_crypt_xattr(dict_t *dict, + char *key, data_t *value, void *data) +{ + dict_del(dict, key); + return 0; +} + +static int32_t crypt_fsetxattr(call_frame_t *frame, + xlator_t *this, + fd_t *fd, + dict_t *dict, + int32_t flags, dict_t *xdata) +{ + dict_foreach_fnmatch(dict, "trusted.glusterfs.crypt*", + filter_crypt_xattr, NULL); + STACK_WIND(frame, + default_fsetxattr_cbk, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->fsetxattr, + fd, + dict, + flags, + xdata); + return 0; +} + +/* + * TBD: verify file metadata before wind + */ +static int32_t crypt_setxattr(call_frame_t *frame, + xlator_t *this, + loc_t *loc, + dict_t *dict, + int32_t flags, dict_t *xdata) +{ + dict_foreach_fnmatch(dict, "trusted.glusterfs.crypt*", + filter_crypt_xattr, NULL); + STACK_WIND(frame, + default_setxattr_cbk, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->setxattr, + loc, + dict, + flags, + xdata); + return 0; +} + +/* + * called as flush_cbk() + */ +static int32_t linkop_end(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + dict_t *xdata) +{ + crypt_local_t *local = frame->local; + linkop_unwind_handler_t unwind_fn; + unwind_fn = linkop_unwind_dispatch(local->fop); + + local->op_ret = op_ret; + local->op_errno = op_errno; + + if (op_ret < 0 && + op_errno == ENOENT && + local->loc->inode->ia_type == IA_IFLNK) { + local->op_ret = 0; + local->op_errno = 0; + } + unwind_fn(frame); + return 0; +} + +/* + * unpin inode on the server + */ +static int32_t link_flush(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + inode_t *inode, + struct iatt *buf, + struct iatt *preparent, + struct iatt *postparent, dict_t *xdata) +{ + crypt_local_t *local = frame->local; + + if (op_ret < 0) + goto error; + if (local->xdata) { + dict_unref(local->xdata); + local->xdata = NULL; + } + if (xdata) + local->xdata = dict_ref(xdata); + local->inode = inode_ref(inode); + local->buf = *buf; + 
local->prebuf = *preparent; + local->postbuf = *postparent; + + STACK_WIND(frame, + linkop_end, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->flush, + local->fd, + NULL); + return 0; + error: + local->op_ret = -1; + local->op_errno = op_errno; + link_unwind(frame); + return 0; +} + +void link_unwind(call_frame_t *frame) +{ + crypt_local_t *local = frame->local; + dict_t *xdata; + dict_t *xattr; + inode_t *inode; + + if (!local) { + STACK_UNWIND_STRICT(link, + frame, + -1, + ENOMEM, + NULL, + NULL, + NULL, + NULL, + NULL); + return; + } + xdata = local->xdata; + xattr = local->xattr; + inode = local->inode; + + if (local->loc){ + loc_wipe(local->loc); + GF_FREE(local->loc); + } + if (local->newloc) { + loc_wipe(local->newloc); + GF_FREE(local->newloc); + } + if (local->fd) + fd_unref(local->fd); + if (local->format) + GF_FREE(local->format); + + STACK_UNWIND_STRICT(link, + frame, + local->op_ret, + local->op_errno, + inode, + &local->buf, + &local->prebuf, + &local->postbuf, + xdata); + if (xdata) + dict_unref(xdata); + if (xattr) + dict_unref(xattr); + if (inode) + inode_unref(inode); +} + +void link_wind(call_frame_t *frame, xlator_t *this) +{ + crypt_local_t *local = frame->local; + + STACK_WIND(frame, + link_flush, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->link, + local->loc, + local->newloc, + local->xdata); +} + +/* + * unlink() + */ +static int32_t unlink_flush(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iatt *preparent, + struct iatt *postparent, dict_t *xdata) +{ + crypt_local_t *local = frame->local; + + if (op_ret < 0) + goto error; + local->prebuf = *preparent; + local->postbuf = *postparent; + if (local->xdata) { + dict_unref(local->xdata); + local->xdata = NULL; + } + if (xdata) + local->xdata = dict_ref(xdata); + + STACK_WIND(frame, + linkop_end, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->flush, + local->fd, + NULL); + return 0; + error: + local->op_ret = -1; + local->op_errno = 
op_errno; + unlink_unwind(frame); + return 0; +} + +void unlink_unwind(call_frame_t *frame) +{ + crypt_local_t *local = frame->local; + dict_t *xdata; + dict_t *xattr; + + if (!local) { + STACK_UNWIND_STRICT(unlink, + frame, + -1, + ENOMEM, + NULL, + NULL, + NULL); + return; + } + xdata = local->xdata; + xattr = local->xattr; + if (local->loc){ + loc_wipe(local->loc); + GF_FREE(local->loc); + } + if (local->fd) + fd_unref(local->fd); + if (local->format) + GF_FREE(local->format); + + STACK_UNWIND_STRICT(unlink, + frame, + local->op_ret, + local->op_errno, + &local->prebuf, + &local->postbuf, + xdata); + if (xdata) + dict_unref(xdata); + if (xattr) + dict_unref(xattr); +} + +void unlink_wind(call_frame_t *frame, xlator_t *this) +{ + crypt_local_t *local = frame->local; + + STACK_WIND(frame, + unlink_flush, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->unlink, + local->loc, + local->flags, + local->xdata); +} + +void rename_unwind(call_frame_t *frame) +{ + crypt_local_t *local = frame->local; + dict_t *xdata; + dict_t *xattr; + struct iatt *prenewparent; + struct iatt *postnewparent; + + if (!local) { + STACK_UNWIND_STRICT(rename, + frame, + -1, + ENOMEM, + NULL, + NULL, + NULL, + NULL, + NULL, + NULL); + return; + } + xdata = local->xdata; + xattr = local->xattr; + prenewparent = local->prenewparent; + postnewparent = local->postnewparent; + + if (local->loc){ + loc_wipe(local->loc); + GF_FREE(local->loc); + } + if (local->newloc){ + loc_wipe(local->newloc); + GF_FREE(local->newloc); + } + if (local->fd) + fd_unref(local->fd); + if (local->format) + GF_FREE(local->format); + + STACK_UNWIND_STRICT(rename, + frame, + local->op_ret, + local->op_errno, + &local->buf, + &local->prebuf, + &local->postbuf, + prenewparent, + postnewparent, + xdata); + if (xdata) + dict_unref(xdata); + if (xattr) + dict_unref(xattr); + if (prenewparent) + GF_FREE(prenewparent); + if (postnewparent) + GF_FREE(postnewparent); +} + +/* + * called as flush_cbk() + */ +static int32_t 
rename_end(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + dict_t *xdata) +{ + crypt_local_t *local = frame->local; + + local->op_ret = op_ret; + local->op_errno = op_errno; + + rename_unwind(frame); + return 0; +} + +static int32_t rename_flush(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iatt *buf, + struct iatt *preoldparent, + struct iatt *postoldparent, + struct iatt *prenewparent, + struct iatt *postnewparent, + dict_t *xdata) +{ + crypt_local_t *local = frame->local; + + if (op_ret < 0) + goto error; + dict_unref(local->xdata); + local->xdata = NULL; + if (xdata) + local->xdata = dict_ref(xdata); + + local->buf = *buf; + local->prebuf = *preoldparent; + local->postbuf = *postoldparent; + if (prenewparent) { + local->prenewparent = GF_CALLOC(1, sizeof(*prenewparent), + gf_crypt_mt_iatt); + if (!local->prenewparent) { + op_errno = ENOMEM; + goto error; + } + *local->prenewparent = *prenewparent; + } + if (postnewparent) { + local->postnewparent = GF_CALLOC(1, sizeof(*postnewparent), + gf_crypt_mt_iatt); + if (!local->postnewparent) { + op_errno = ENOMEM; + goto error; + } + *local->postnewparent = *postnewparent; + } + STACK_WIND(frame, + rename_end, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->flush, + local->fd, + NULL); + return 0; + error: + local->op_ret = -1; + local->op_errno = op_errno; + rename_unwind(frame); + return 0; +} + +void rename_wind(call_frame_t *frame, xlator_t *this) +{ + crypt_local_t *local = frame->local; + + STACK_WIND(frame, + rename_flush, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->rename, + local->loc, + local->newloc, + local->xdata); +} + +static int32_t __do_linkop(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, dict_t *xdata) +{ + crypt_local_t *local = frame->local; + linkop_wind_handler_t wind_fn; + linkop_unwind_handler_t unwind_fn; + + wind_fn = 
linkop_wind_dispatch(local->fop); + unwind_fn = linkop_unwind_dispatch(local->fop); + + local->op_ret = op_ret; + local->op_errno = op_errno; + + if (op_ret >= 0) + wind_fn(frame, this); + else { + gf_log(this->name, GF_LOG_WARNING, "mtd unlock failed (%d)", + op_errno); + unwind_fn(frame); + } + return 0; +} + +static int32_t do_linkop(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + dict_t *xdata) +{ + struct gf_flock lock = {0, }; + crypt_local_t *local = frame->local; + linkop_unwind_handler_t unwind_fn; + + unwind_fn = linkop_unwind_dispatch(local->fop); + local->op_ret = op_ret; + local->op_errno = op_errno; + + if(op_ret < 0) + goto error; + + lock.l_type = F_UNLCK; + lock.l_whence = SEEK_SET; + lock.l_start = 0; + lock.l_len = 0; + lock.l_pid = 0; + + STACK_WIND(frame, + __do_linkop, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->finodelk, + this->name, + local->fd, + F_SETLKW, + &lock, + NULL); + return 0; + error: + unwind_fn(frame); + return 0; +} + +/* + * Update the metadata string (against the new pathname); + * submit the result + */ +static int32_t linkop_begin(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + fd_t *fd, + dict_t *xdata) +{ + gf_boolean_t upload_info; + crypt_local_t *local = frame->local; + crypt_private_t *priv = this->private; + struct crypt_inode_info *info; + data_t *old_mtd; + uint32_t new_mtd_size; + uint64_t value = 0; + void (*unwind_fn)(call_frame_t *frame); + void (*wind_fn)(call_frame_t *frame, xlator_t *this); + mtd_op_t mop; + + wind_fn = linkop_wind_dispatch(local->fop); + unwind_fn = linkop_unwind_dispatch(local->fop); + mop = linkop_mtdop_dispatch(local->fop); + + if (local->fd->inode->ia_type == IA_IFLNK) + goto wind; + if (op_ret < 0) + /* + * verification failed + */ + goto error; + + old_mtd = dict_get(xdata, CRYPTO_FORMAT_PREFIX); + if (!old_mtd) { + op_errno = EIO; + gf_log (this->name, GF_LOG_DEBUG, + "Metadata string 
wasn't found"); + goto error; + } + new_mtd_size = format_size(mop, old_mtd->len); + op_errno = alloc_format(local, new_mtd_size); + if (op_errno) + goto error; + /* + * check for cached info + */ + op_ret = inode_ctx_get(fd->inode, this, &value); + if (op_ret != -1) { + info = (struct crypt_inode_info *)(long)value; + if (info == NULL) { + gf_log (this->name, GF_LOG_WARNING, + "Inode info was not found"); + op_errno = EINVAL; + goto error; + } + /* + * info was found in the cache + */ + local->info = info; + upload_info = _gf_false; + } + else { + /* + * info wasn't found in the cache; + */ + info = alloc_inode_info(local, local->loc); + if (!info) + goto error; + init_inode_info_head(info, fd); + local->info = info; + upload_info = _gf_true; + } + op_errno = open_format((unsigned char *)old_mtd->data, + old_mtd->len, + local->loc, + info, + get_master_cinfo(priv), + local, + upload_info); + if (op_errno) + goto error; + if (upload_info == _gf_true) { + op_errno = init_inode_info_tail(info, + get_master_cinfo(priv)); + if (op_errno) + goto error; + op_errno = inode_ctx_put(fd->inode, this, + (uint64_t)(long)(info)); + if (op_errno == -1) { + op_errno = EIO; + goto error; + } + } + /* + * update the format string (append/update/cup a MAC) + */ + op_errno = update_format(local->format, + (unsigned char *)old_mtd->data, + old_mtd->len, + local->mac_idx, + mop, + local->newloc, + info, + get_master_cinfo(priv), + local); + if (op_errno) + goto error; + /* + * store the new format string on the server + */ + if (new_mtd_size) { + op_errno = dict_set_static_bin(local->xattr, + CRYPTO_FORMAT_PREFIX, + local->format, + new_mtd_size); + if (op_errno) + goto error; + } + STACK_WIND(frame, + do_linkop, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->setxattr, + local->loc, + local->xattr, + 0, + NULL); + return 0; + wind: + wind_fn(frame, this); + return 0; + error: + local->op_ret = -1; + local->op_errno = op_errno; + unwind_fn(frame); + return 0; +} + +static int32_t 
linkop_grab_local(call_frame_t *frame, + xlator_t *this, + loc_t *oldloc, + loc_t *newloc, + int flags, dict_t *xdata, + glusterfs_fop_t op) +{ + int32_t ret = ENOMEM; + fd_t *fd; + crypt_local_t *local; + + local = crypt_alloc_local(frame, this, op); + if (!local) + goto error; + if (xdata) + local->xdata = dict_ref(xdata); + + fd = fd_create(oldloc->inode, frame->root->pid); + if (!fd) { + gf_log(this->name, GF_LOG_ERROR, "Can not create fd"); + goto error; + } + local->fd = fd; + local->flags = flags; + local->loc = GF_CALLOC(1, sizeof(*oldloc), gf_crypt_mt_loc); + if (!local->loc) + goto error; + memset(local->loc, 0, sizeof(*local->loc)); + ret = loc_copy(local->loc, oldloc); + if (ret) { + GF_FREE(local->loc); + local->loc = NULL; + goto error; + } + if (newloc) { + local->newloc = GF_CALLOC(1, sizeof(*newloc), gf_crypt_mt_loc); + if (!local->newloc) { + /* wipe before freeing; report the allocation failure */ + loc_wipe(local->loc); + GF_FREE(local->loc); + ret = ENOMEM; + goto error; + } + memset(local->newloc, 0, sizeof(*local->newloc)); + ret = loc_copy(local->newloc, newloc); + if (ret) { + loc_wipe(local->loc); + GF_FREE(local->loc); + GF_FREE(local->newloc); + goto error; + } + } + local->xattr = dict_new(); + if (!local->xattr) { + gf_log(this->name, GF_LOG_ERROR, "Can not create dict"); + ret = ENOMEM; + goto error; + } + return 0; + error: + if (local->xdata) + dict_unref(local->xdata); + if (local->fd) + fd_unref(local->fd); + local->fd = 0; + local->loc = NULL; + local->newloc = NULL; + + local->op_ret = -1; + local->op_errno = ret; + + return ret; +} + +/* + * read and verify locked metadata against the old pathname (via open); + * update the metadata string in accordance with the new pathname; + * submit modified metadata; + * wind; + */ +static int32_t linkop(call_frame_t *frame, + xlator_t *this, + loc_t *oldloc, + loc_t *newloc, + int flags, + dict_t *xdata, + glusterfs_fop_t op) +{ + int32_t ret; + dict_t *dict; + crypt_local_t *local; + void (*unwind_fn)(call_frame_t *frame); + + unwind_fn = 
linkop_unwind_dispatch(op); + + ret = linkop_grab_local(frame, this, oldloc, newloc, flags, xdata, op); + local = frame->local; + if (ret) + goto error; + dict = dict_new(); + if (!dict) { + gf_log(this->name, GF_LOG_ERROR, "Can not create dict"); + ret = ENOMEM; + goto error; + } + /* + * Set a message to crypt_open() that we need + * locked metadata string. + * All link operations (link, unlink, rename) + * need write lock + */ + msgflags_set_mtd_wlock(&local->msgflags); + ret = dict_set_static_bin(dict, + MSGFLAGS_PREFIX, + &local->msgflags, + sizeof(local->msgflags)); + if (ret) { + gf_log(this->name, GF_LOG_ERROR, "Can not set dict"); + dict_unref(dict); + goto error; + } + /* + * verify metadata against the old pathname + * and retrieve locked metadata string + */ + STACK_WIND(frame, + linkop_begin, + this, + this->fops->open, /* crypt_open() */ + oldloc, + O_RDWR, + local->fd, + dict); + dict_unref(dict); + return 0; + error: + local->op_ret = -1; + local->op_errno = ret; + unwind_fn(frame); + return 0; +} + +static int32_t crypt_link(call_frame_t *frame, xlator_t *this, + loc_t *oldloc, loc_t *newloc, dict_t *xdata) +{ + return linkop(frame, this, oldloc, newloc, 0, xdata, GF_FOP_LINK); +} + +static int32_t crypt_unlink(call_frame_t *frame, xlator_t *this, + loc_t *loc, int flags, dict_t *xdata) +{ + return linkop(frame, this, loc, NULL, flags, xdata, GF_FOP_UNLINK); +} + +static int32_t crypt_rename(call_frame_t *frame, xlator_t *this, + loc_t *oldloc, loc_t *newloc, dict_t *xdata) +{ + return linkop(frame, this, oldloc, newloc, 0, xdata, GF_FOP_RENAME); +} + +static void put_one_call_open(call_frame_t *frame) +{ + crypt_local_t *local = frame->local; + if (put_one_call(local)) { + fd_t *fd = local->fd; + loc_t *loc = local->loc; + dict_t *xdata = local->xdata; + + STACK_UNWIND_STRICT(open, + frame, + local->op_ret, + local->op_errno, + fd, + xdata); + fd_unref(fd); + if (xdata) + dict_unref(xdata); + loc_wipe(loc); + GF_FREE(loc); + } +} + +static int32_t 
__crypt_readv_done(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, dict_t *xdata) +{ + crypt_local_t *local = frame->local; + fd_t *local_fd = local->fd; + dict_t *local_xdata = local->xdata; + /* read deals with data configs only */ + struct iovec *avec = local->data_conf.avec; + char **pool = local->data_conf.pool; + int blocks_in_pool = local->data_conf.blocks_in_pool; + struct iobref *iobref = local->iobref; + struct iobref *iobref_data = local->iobref_data; + + if (op_ret < 0) { + gf_log(this->name, GF_LOG_WARNING, + "readv unlock failed (%d)", op_errno); + if (local->op_ret >= 0) { + local->op_ret = op_ret; + local->op_errno = op_errno; + } + } + dump_plain_text(local, avec); + + gf_log("crypt", GF_LOG_DEBUG, + "readv: ret_to_user: %d, iovec len: %d, ia_size: %llu", + (int)(local->rw_count > 0 ? local->rw_count : local->op_ret), + (int)(local->rw_count > 0 ? iovec_get_size(avec, local->data_conf.acount) : 0), + (unsigned long long)local->buf.ia_size); + + STACK_UNWIND_STRICT(readv, + frame, + local->rw_count > 0 ? local->rw_count : local->op_ret, + local->op_errno, + avec, + avec ? 
local->data_conf.acount : 0, + &local->buf, + local->iobref, + local_xdata); + + free_avec(avec, pool, blocks_in_pool); + fd_unref(local_fd); + if (local_xdata) + dict_unref(local_xdata); + if (iobref) + iobref_unref(iobref); + if (iobref_data) + iobref_unref(iobref_data); + return 0; +} + +static void crypt_readv_done(call_frame_t *frame, xlator_t *this) +{ + if (parent_is_crypt_xlator(frame, this)) + /* + * don't unlock (it will be done by the parent) + */ + __crypt_readv_done(frame, NULL, this, 0, 0, NULL); + else { + crypt_local_t *local = frame->local; + struct gf_flock lock = {0, }; + + lock.l_type = F_UNLCK; + lock.l_whence = SEEK_SET; + lock.l_start = 0; + lock.l_len = 0; + lock.l_pid = 0; + + STACK_WIND(frame, + __crypt_readv_done, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->finodelk, + this->name, + local->fd, + F_SETLKW, + &lock, + NULL); + } +} + +static void put_one_call_readv(call_frame_t *frame, xlator_t *this) +{ + crypt_local_t *local = frame->local; + if (put_one_call(local)) + crypt_readv_done(frame, this); +} + +static int32_t __crypt_writev_done(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, dict_t *xdata) +{ + crypt_local_t *local = frame->local; + fd_t *local_fd = local->fd; + dict_t *local_xdata = local->xdata; + int32_t ret_to_user; + + if (local->xattr) + dict_unref(local->xattr); + /* + * Calculate amount of bytes to be returned + * to user. We need to subtract paddings that + * have been written as a part of atom. 
+ */ + /* + * subtract head padding + */ + if (local->rw_count == 0) + /* + * Nothing has been written, it must be an error + */ + ret_to_user = local->op_ret; + else if (local->rw_count <= local->data_conf.off_in_head) { + gf_log("crypt", GF_LOG_WARNING, "Incomplete write"); + ret_to_user = 0; + } + else + ret_to_user = local->rw_count - + local->data_conf.off_in_head; + /* + * subtract tail padding + */ + if (ret_to_user > local->data_conf.orig_size) + ret_to_user = local->data_conf.orig_size; + + if (local->iobref) + iobref_unref(local->iobref); + if (local->iobref_data) + iobref_unref(local->iobref_data); + free_avec_data(local); + free_avec_hole(local); + + gf_log("crypt", GF_LOG_DEBUG, + "writev: ret_to_user: %d", ret_to_user); + + STACK_UNWIND_STRICT(writev, + frame, + ret_to_user, + local->op_errno, + &local->prebuf, + &local->postbuf, + local_xdata); + fd_unref(local_fd); + if (local_xdata) + dict_unref(local_xdata); + return 0; +} + +static int32_t crypt_writev_done(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + dict_t *xdata) +{ + crypt_local_t *local = frame->local; + + if (op_ret < 0) + gf_log("crypt", GF_LOG_WARNING, "can not update file size"); + + if (parent_is_crypt_xlator(frame, this)) + /* + * don't unlock (it will be done by the parent) + */ + __crypt_writev_done(frame, NULL, this, 0, 0, NULL); + else { + struct gf_flock lock = {0, }; + + lock.l_type = F_UNLCK; + lock.l_whence = SEEK_SET; + lock.l_start = 0; + lock.l_len = 0; + lock.l_pid = 0; + + STACK_WIND(frame, + __crypt_writev_done, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->finodelk, + this->name, + local->fd, + F_SETLKW, + &lock, + NULL); + } + return 0; +} + +static void put_one_call_writev(call_frame_t *frame, xlator_t *this) +{ + crypt_local_t *local = frame->local; + if (put_one_call(local)) { + if (local->update_disk_file_size) { + int32_t ret; + /* + * update file size, unlock the file and unwind + */ + ret = 
dict_set(local->xattr, + FSIZE_XATTR_PREFIX, + data_from_uint64(local->cur_file_size)); + if (ret) { + gf_log("crypt", GF_LOG_WARNING, + "can not set key to update file size"); + crypt_writev_done(frame, NULL, + this, 0, 0, NULL); + return; + } + gf_log("crypt", GF_LOG_DEBUG, + "Updating disk file size to %llu", + (unsigned long long)local->cur_file_size); + STACK_WIND(frame, + crypt_writev_done, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->fsetxattr, + local->fd, + local->xattr, /* CRYPTO_FORMAT_PREFIX */ + 0, + NULL); + } + else + crypt_writev_done(frame, NULL, this, 0, 0, NULL); + } +} + +static int32_t __crypt_ftruncate_done(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, dict_t *xdata) +{ + crypt_local_t *local = frame->local; + fd_t *local_fd = local->fd; + dict_t *local_xdata = local->xdata; + char *iobase = local->vec.iov_base; + + if (op_ret < 0) { + gf_log(this->name, GF_LOG_WARNING, + "ftruncate unlock failed (%d)", op_errno); + if (local->op_ret >= 0) { + local->op_ret = op_ret; + local->op_errno = op_errno; + } + } + if (local->iobref_data) + iobref_unref(local->iobref_data); + free_avec_data(local); + free_avec_hole(local); + + gf_log("crypt", GF_LOG_DEBUG, + "ftruncate, return to user: presize=%llu, postsize=%llu", + (unsigned long long)local->prebuf.ia_size, + (unsigned long long)local->postbuf.ia_size); + + STACK_UNWIND_STRICT(ftruncate, + frame, + local->op_ret < 0 ? 
-1 : 0, + local->op_errno, + &local->prebuf, + &local->postbuf, + local_xdata); + fd_unref(local_fd); + if (local_xdata) + dict_unref(local_xdata); + if (iobase) + GF_FREE(iobase); + return 0; +} + +static int32_t crypt_ftruncate_done(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + dict_t *xdata) +{ + crypt_local_t *local = frame->local; + struct gf_flock lock = {0, }; + + dict_unref(local->xattr); + if (op_ret < 0) + gf_log("crypt", GF_LOG_WARNING, "can not update file size"); + + lock.l_type = F_UNLCK; + lock.l_whence = SEEK_SET; + lock.l_start = 0; + lock.l_len = 0; + lock.l_pid = 0; + + STACK_WIND(frame, + __crypt_ftruncate_done, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->finodelk, + this->name, + local->fd, + F_SETLKW, + &lock, + NULL); + return 0; +} + +static void put_one_call_ftruncate(call_frame_t *frame, xlator_t *this) +{ + crypt_local_t *local = frame->local; + if (put_one_call(local)) { + if (local->update_disk_file_size) { + int32_t ret; + /* + * update file size, unlock the file and unwind + */ + ret = dict_set(local->xattr, + FSIZE_XATTR_PREFIX, + data_from_uint64(local->cur_file_size)); + if (ret) { + gf_log("crypt", GF_LOG_WARNING, + "can not set key to update file size"); + crypt_ftruncate_done(frame, NULL, + this, 0, 0, NULL); + return; + } + gf_log("crypt", GF_LOG_DEBUG, + "Updating disk file size to %llu", + (unsigned long long)local->cur_file_size); + STACK_WIND(frame, + crypt_ftruncate_done, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->fsetxattr, + local->fd, + local->xattr, /* CRYPTO_FORMAT_PREFIX */ + 0, + NULL); + } + else + crypt_ftruncate_done(frame, NULL, this, 0, 0, NULL); + } +} + +/* + * load regular file size for some FOPs + */ +static int32_t load_file_size(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + dict_t *dict, + dict_t *xdata) +{ + data_t *data; + crypt_local_t *local = frame->local; + + dict_t *local_xdata = local->xdata; 
+ inode_t *local_inode = local->inode; + + if (op_ret < 0) + goto unwind; + /* + * load regular file size + */ + data = dict_get(dict, FSIZE_XATTR_PREFIX); + if (!data) { + /* + * local->xdata is released once, after unwind (see local_xdata) + */ + gf_log("crypt", GF_LOG_WARNING, "Regular file size not found"); + op_ret = -1; + op_errno = EIO; + goto unwind; + } + local->buf.ia_size = data_to_uint64(data); + + gf_log(this->name, GF_LOG_DEBUG, + "FOP %d: Translate regular file size to %llu", + local->fop, + (unsigned long long)local->buf.ia_size); + unwind: + if (local->fd) + fd_unref(local->fd); + if (local->loc) { + loc_wipe(local->loc); + GF_FREE(local->loc); + } + switch (local->fop) { + case GF_FOP_FSTAT: + STACK_UNWIND_STRICT(fstat, + frame, + op_ret, + op_errno, + op_ret >= 0 ? &local->buf : NULL, + local->xdata); + break; + case GF_FOP_STAT: + STACK_UNWIND_STRICT(stat, + frame, + op_ret, + op_errno, + op_ret >= 0 ? &local->buf : NULL, + local->xdata); + break; + case GF_FOP_LOOKUP: + STACK_UNWIND_STRICT(lookup, + frame, + op_ret, + op_errno, + op_ret >= 0 ? local->inode : NULL, + op_ret >= 0 ? &local->buf : NULL, + local->xdata, + op_ret >= 0 ? &local->postbuf : NULL); + break; + case GF_FOP_READ: + STACK_UNWIND_STRICT(readv, + frame, + op_ret, + op_errno, + NULL, + 0, + op_ret >= 0 ? 
&local->buf : NULL, + NULL, + NULL); + break; + default: + gf_log(this->name, GF_LOG_WARNING, + "Improper file operation %d", local->fop); + } + if (local_xdata) + dict_unref(local_xdata); + if (local_inode) + inode_unref(local_inode); + return 0; +} + +static int32_t crypt_stat_common_cbk(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iatt *buf, dict_t *xdata) +{ + crypt_local_t *local = frame->local; + + if (op_ret < 0) + goto unwind; + if (!IA_ISREG(buf->ia_type)) + goto unwind; + + local->buf = *buf; + if (xdata) + local->xdata = dict_ref(xdata); + + switch (local->fop) { + case GF_FOP_FSTAT: + STACK_WIND(frame, + load_file_size, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->fgetxattr, + local->fd, + FSIZE_XATTR_PREFIX, + NULL); + break; + case GF_FOP_STAT: + STACK_WIND(frame, + load_file_size, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->getxattr, + local->loc, + FSIZE_XATTR_PREFIX, + NULL); + break; + default: + gf_log (this->name, GF_LOG_WARNING, + "Improper file operation %d", local->fop); + } + return 0; + unwind: + if (local->fd) + fd_unref(local->fd); + if (local->loc) { + loc_wipe(local->loc); + GF_FREE(local->loc); + } + switch (local->fop) { + case GF_FOP_FSTAT: + STACK_UNWIND_STRICT(fstat, + frame, + op_ret, + op_errno, + op_ret >= 0 ? buf : NULL, + op_ret >= 0 ? xdata : NULL); + break; + case GF_FOP_STAT: + STACK_UNWIND_STRICT(stat, + frame, + op_ret, + op_errno, + op_ret >= 0 ? buf : NULL, + op_ret >= 0 ? 
xdata : NULL); + break; + default: + gf_log (this->name, GF_LOG_WARNING, + "Improper file operation %d", local->fop); + } + return 0; +} + +static int32_t crypt_fstat(call_frame_t *frame, + xlator_t *this, + fd_t *fd, dict_t *xdata) +{ + crypt_local_t *local; + + local = crypt_alloc_local(frame, this, GF_FOP_FSTAT); + if (!local) + goto error; + local->fd = fd_ref(fd); + STACK_WIND(frame, + crypt_stat_common_cbk, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->fstat, + fd, + xdata); + return 0; + error: + STACK_UNWIND_STRICT(fstat, + frame, + -1, + ENOMEM, + NULL, + NULL); + return 0; +} + +static int32_t crypt_stat(call_frame_t *frame, + xlator_t *this, + loc_t *loc, dict_t *xdata) +{ + int32_t ret; + crypt_local_t *local; + + local = crypt_alloc_local(frame, this, GF_FOP_STAT); + if (!local) + goto error; + local->loc = GF_CALLOC(1, sizeof(*loc), gf_crypt_mt_loc); + if (!local->loc) + goto error; + memset(local->loc, 0, sizeof(*local->loc)); + ret = loc_copy(local->loc, loc); + if (ret) { + GF_FREE(local->loc); + goto error; + } + STACK_WIND(frame, + crypt_stat_common_cbk, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->stat, + loc, + xdata); + return 0; + error: + STACK_UNWIND_STRICT(stat, + frame, + -1, + ENOMEM, + NULL, + NULL); + return 0; +} + +static int32_t crypt_lookup_cbk(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + inode_t *inode, + struct iatt *buf, dict_t *xdata, + struct iatt *postparent) +{ + crypt_local_t *local = frame->local; + + if (op_ret < 0) + goto unwind; + if (!IA_ISREG(buf->ia_type)) + goto unwind; + + local->inode = inode_ref(inode); + local->buf = *buf; + local->postbuf = *postparent; + if (xdata) + local->xdata = dict_ref(xdata); + uuid_copy(local->loc->gfid, buf->ia_gfid); + + STACK_WIND(frame, + load_file_size, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->getxattr, + local->loc, + FSIZE_XATTR_PREFIX, + NULL); + return 0; + unwind: + loc_wipe(local->loc); + GF_FREE(local->loc); + 
STACK_UNWIND_STRICT(lookup, + frame, + op_ret, + op_errno, + inode, + buf, + xdata, + postparent); + return 0; +} + +static int32_t crypt_lookup(call_frame_t *frame, + xlator_t *this, + loc_t *loc, dict_t *xdata) +{ + int32_t ret; + crypt_local_t *local; + + local = crypt_alloc_local(frame, this, GF_FOP_LOOKUP); + if (!local) + goto error; + local->loc = GF_CALLOC(1, sizeof(*loc), gf_crypt_mt_loc); + if (!local->loc) + goto error; + memset(local->loc, 0, sizeof(*local->loc)); + ret = loc_copy(local->loc, loc); + if (ret) { + GF_FREE(local->loc); + goto error; + } + gf_log(this->name, GF_LOG_DEBUG, "Lookup %s", loc->path); + STACK_WIND(frame, + crypt_lookup_cbk, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->lookup, + loc, + xdata); + return 0; + error: + STACK_UNWIND_STRICT(lookup, + frame, + -1, + ENOMEM, + NULL, + NULL, + NULL, + NULL); + return 0; +} + +/* + * for every regular directory entry find its real file size + * and update stat's buf properly + */ +static int32_t crypt_readdirp_cbk(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + gf_dirent_t *entries, dict_t *xdata) +{ + gf_dirent_t *entry = NULL; + + if (op_ret < 0) + goto unwind; + + list_for_each_entry (entry, (&entries->list), list) { + data_t *data; + + if (!IA_ISREG(entry->d_stat.ia_type)) + continue; + data = dict_get(entry->dict, FSIZE_XATTR_PREFIX); + if (!data){ + gf_log("crypt", GF_LOG_WARNING, + "Regular file size of direntry not found"); + op_errno = EIO; + op_ret = -1; + break; + } + entry->d_stat.ia_size = data_to_uint64(data); + } + unwind: + STACK_UNWIND_STRICT(readdirp, frame, op_ret, op_errno, entries, xdata); + return 0; +} + +/* + * ->readdirp() fills in-core inodes, so we need to set proper + * file sizes for all directory entries of the parent @fd. 
+ * Actual updates take place in ->crypt_readdirp_cbk() + */ +static int32_t crypt_readdirp(call_frame_t *frame, xlator_t *this, + fd_t *fd, size_t size, off_t offset, + dict_t *xdata) +{ + int32_t ret = ENOMEM; + + if (!xdata) { + xdata = dict_new(); + if (!xdata) + goto error; + } + else + dict_ref(xdata); + /* + * make sure that we'll have real file sizes at ->readdirp_cbk() + */ + ret = dict_set(xdata, FSIZE_XATTR_PREFIX, data_from_uint64(0)); + if (ret) { + dict_unref(xdata); + goto error; + } + STACK_WIND(frame, + crypt_readdirp_cbk, + FIRST_CHILD(this), + FIRST_CHILD(this)->fops->readdirp, + fd, + size, + offset, + xdata); + dict_unref(xdata); + return 0; + error: + STACK_UNWIND_STRICT(readdirp, frame, -1, ret, NULL, NULL); + return 0; +} + +static int32_t crypt_access(call_frame_t *frame, + xlator_t *this, + loc_t *loc, + int32_t mask, dict_t *xdata) +{ + gf_log(this->name, GF_LOG_WARNING, + "NFS mounts of encrypted volumes are unsupported"); + STACK_UNWIND_STRICT(access, frame, -1, EPERM, NULL); + return 0; +} + +int32_t master_set_block_size (xlator_t *this, crypt_private_t *priv, + dict_t *options) +{ + uint64_t block_size = 0; + struct master_cipher_info *master = get_master_cinfo(priv); + + if (options != NULL) + GF_OPTION_RECONF("block-size", block_size, options, + size, error); + else + GF_OPTION_INIT("block-size", block_size, size, error); + + switch (block_size) { + case 512: + master->m_block_bits = 9; + break; + case 1024: + master->m_block_bits = 10; + break; + case 2048: + master->m_block_bits = 11; + break; + case 4096: + master->m_block_bits = 12; + break; + default: + gf_log("crypt", GF_LOG_ERROR, + "FATAL: unsupported block size %llu", + (unsigned long long)block_size); + goto error; + } + return 0; + error: + return -1; +} + +int32_t master_set_alg(xlator_t *this, crypt_private_t *priv) +{ + struct master_cipher_info *master = get_master_cinfo(priv); + master->m_alg = AES_CIPHER_ALG; + return 0; +} + +int32_t master_set_mode(xlator_t 
*this, crypt_private_t *priv) +{ + struct master_cipher_info *master = get_master_cinfo(priv); + master->m_mode = XTS_CIPHER_MODE; + return 0; +} + +/* + * set key size in bits to the master info + * Pre-conditions: cipher mode in the master info is uptodate. + */ +static int master_set_data_key_size (xlator_t *this, crypt_private_t *priv, + dict_t *options) +{ + int32_t ret; + uint64_t key_size = 0; + struct master_cipher_info *master = get_master_cinfo(priv); + + if (options != NULL) + GF_OPTION_RECONF("data-key-size", key_size, options, + size, error); + else + GF_OPTION_INIT("data-key-size", key_size, size, error); + + ret = data_cipher_algs[master->m_alg][master->m_mode].check_key(key_size); + if (ret) { + gf_log("crypt", GF_LOG_ERROR, + "FATAL: wrong bin key size %llu for alg %d mode %d", + (unsigned long long)key_size, + (int)master->m_alg, + (int)master->m_mode); + goto error; + } + master->m_dkey_size = key_size; + return 0; + error: + return -1; +} + +static int is_hex(char *s) { + return ('0' <= *s && *s <= '9') || ('a' <= *s && *s <= 'f'); +} + +static int parse_hex_buf(xlator_t *this, char *src, unsigned char *dst, + int hex_size) +{ + int i; + int hex_byte = 0; + + for (i = 0; i < (hex_size / 2); i++) { + if (!is_hex(src + i*2) || !is_hex(src + i*2 + 1)) { + gf_log("crypt", GF_LOG_ERROR, + "FATAL: not hex symbol in key"); + return -1; + } + if (sscanf(src + i*2, "%2x", &hex_byte) != 1) { + gf_log("crypt", GF_LOG_ERROR, + "FATAL: can not parse hex key"); + return -1; + } + dst[i] = hex_byte & 0xff; + } + return 0; +} + +/* + * Parse options; + * install master volume key + */ +int32_t master_set_master_vol_key(xlator_t *this, crypt_private_t *priv) +{ + int32_t ret; + FILE *file = NULL; + + int32_t key_size; + char *opt_key_file_pathname = NULL; + + unsigned char bin_buf[MASTER_VOL_KEY_SIZE]; + char hex_buf[2 * MASTER_VOL_KEY_SIZE]; + + struct master_cipher_info *master = get_master_cinfo(priv); + /* + * extract master key passed via option + */ + 
GF_OPTION_INIT("master-key", opt_key_file_pathname, path, bad_key); + + if (!opt_key_file_pathname) { + gf_log(this->name, GF_LOG_ERROR, "FATAL: missing master key"); + return -1; + } + gf_log(this->name, GF_LOG_DEBUG, "handling file key %s", + opt_key_file_pathname); + + file = fopen(opt_key_file_pathname, "r"); + if (file == NULL) { + gf_log(this->name, GF_LOG_ERROR, + "FATAL: can not open file with master key"); + return -1; + } + /* + * extract hex key + */ + key_size = fread(hex_buf, 1, sizeof(hex_buf), file); + if (key_size < sizeof(hex_buf)) { + gf_log(this->name, GF_LOG_ERROR, + "FATAL: master key is too short"); + goto bad_key; + } + ret = parse_hex_buf(this, hex_buf, bin_buf, key_size); + if (ret) + goto bad_key; + memcpy(master->m_key, bin_buf, MASTER_VOL_KEY_SIZE); + memset(hex_buf, 0, sizeof(hex_buf)); + fclose(file); + + memset(bin_buf, 0, sizeof(bin_buf)); + return 0; + bad_key: + gf_log(this->name, GF_LOG_ERROR, "FATAL: bad master key"); + if (file) + fclose(file); + memset(bin_buf, 0, sizeof(bin_buf)); + return -1; +} + +/* + * Derive volume key for object-id authentication + */ +int32_t master_set_nmtd_vol_key(xlator_t *this, crypt_private_t *priv) +{ + return get_nmtd_vol_key(get_master_cinfo(priv)); +} + +int32_t crypt_init_xlator(xlator_t *this) +{ + int32_t ret; + crypt_private_t *priv = this->private; + + ret = master_set_alg(this, priv); + if (ret) + return ret; + ret = master_set_mode(this, priv); + if (ret) + return ret; + ret = master_set_block_size(this, priv, NULL); + if (ret) + return ret; + ret = master_set_data_key_size(this, priv, NULL); + if (ret) + return ret; + ret = master_set_master_vol_key(this, priv); + if (ret) + return ret; + return master_set_nmtd_vol_key(this, priv); +} + +static int32_t crypt_alloc_private(xlator_t *this) +{ + this->private = GF_CALLOC(1, sizeof(crypt_private_t), gf_crypt_mt_priv); + if (!this->private) { + gf_log("crypt", GF_LOG_ERROR, + "Can not allocate memory for private data"); + return ENOMEM; + } 
+ return 0; +} + +static void crypt_free_private(xlator_t *this) +{ + crypt_private_t *priv = this->private; + if (priv) { + memset(priv, 0, sizeof(*priv)); + GF_FREE(priv); + } +} + +int32_t reconfigure (xlator_t *this, dict_t *options) +{ + int32_t ret = -1; + crypt_private_t *priv = NULL; + + GF_VALIDATE_OR_GOTO ("crypt", this, error); + GF_VALIDATE_OR_GOTO (this->name, this->private, error); + GF_VALIDATE_OR_GOTO (this->name, options, error); + + priv = this->private; + + ret = master_set_block_size(this, priv, options); + if (ret) { + gf_log(this->name, GF_LOG_ERROR, + "Failed to reconfigure block size"); + goto error; + } + ret = master_set_data_key_size(this, priv, options); + if (ret) { + gf_log(this->name, GF_LOG_ERROR, + "Failed to reconfigure data key size"); + goto error; + } + return 0; + error: + return ret; +} + +int32_t init(xlator_t *this) +{ + int32_t ret; + + if (!this->children || this->children->next) { + gf_log ("crypt", GF_LOG_ERROR, + "FATAL: crypt should have exactly one child"); + return EINVAL; + } + if (!this->parents) { + gf_log (this->name, GF_LOG_WARNING, + "dangling volume. 
check volfile "); + } + ret = crypt_alloc_private(this); + if (ret) + return ret; + ret = crypt_init_xlator(this); + if (ret) + goto error; + this->local_pool = mem_pool_new(crypt_local_t, 64); + if (!this->local_pool) { + gf_log(this->name, GF_LOG_ERROR, + "failed to create local_t's memory pool"); + ret = ENOMEM; + goto error; + } + gf_log ("crypt", GF_LOG_INFO, "crypt xlator loaded"); + return 0; + error: + crypt_free_private(this); + return ret; +} + +void fini (xlator_t *this) +{ + crypt_free_private(this); +} + +struct xlator_fops fops = { + .readv = crypt_readv, + .writev = crypt_writev, + .truncate = crypt_truncate, + .ftruncate = crypt_ftruncate, + .setxattr = crypt_setxattr, + .fsetxattr = crypt_fsetxattr, + .link = crypt_link, + .unlink = crypt_unlink, + .rename = crypt_rename, + .open = crypt_open, + .create = crypt_create, + .stat = crypt_stat, + .fstat = crypt_fstat, + .lookup = crypt_lookup, + .readdirp = crypt_readdirp, + .access = crypt_access +}; + +struct xlator_cbks cbks = { + .forget = crypt_forget +}; + +struct volume_options options[] = { + { .key = {"master-key"}, + .type = GF_OPTION_TYPE_PATH, + .description = "Pathname of the regular file that contains the master volume key" + }, + { .key = {"data-key-size"}, + .type = GF_OPTION_TYPE_SIZET, + .description = "Data key size (bits)", + .min = 256, + .max = 512, + .default_value = "256", + }, + { .key = {"block-size"}, + .type = GF_OPTION_TYPE_SIZET, + .description = "Atom size (bytes)", + .min = 512, + .max = 4096, + .default_value = "4096" + }, + { .key = {NULL} }, +}; + +/* + Local variables: + c-indentation-style: "K&R" + mode-name: "LC" + c-basic-offset: 8 + tab-width: 8 + fill-column: 80 + scroll-step: 1 + End: +*/ diff --git a/xlators/encryption/crypt/src/crypt.h b/xlators/encryption/crypt/src/crypt.h new file mode 100644 index 00000000000..01a8542ab8c --- /dev/null +++ b/xlators/encryption/crypt/src/crypt.h @@ -0,0 +1,899 @@ +/* + Copyright (c) 2008-2012 Red Hat, Inc. 
<http://www.redhat.com> + This file is part of GlusterFS. + + This file is licensed to you under your choice of the GNU Lesser + General Public License, version 3 or any later version (LGPLv3 or + later), or the GNU General Public License, version 2 (GPLv2), in all + cases as published by the Free Software Foundation. +*/ + +#ifndef __CRYPT_H__ +#define __CRYPT_H__ + +#ifndef _CONFIG_H +#define _CONFIG_H +#include "config.h" +#endif +#include <openssl/aes.h> +#include <openssl/evp.h> +#include <openssl/sha.h> +#include <openssl/hmac.h> +#include <openssl/cmac.h> +#include <openssl/modes.h> +#include "crypt-mem-types.h" + +#define CRYPT_XLATOR_ID (0) + +#define MAX_IOVEC_BITS (3) +#define MAX_IOVEC (1 << MAX_IOVEC_BITS) +#define KEY_FACTOR_BITS (6) + +#define DEBUG_CRYPT (0) +#define TRIVIAL_TFM (0) + +#define CRYPT_MIN_BLOCK_BITS (9) +#define CRYPT_MAX_BLOCK_BITS (12) + +#define MASTER_VOL_KEY_SIZE (32) +#define NMTD_VOL_KEY_SIZE (16) + +struct crypt_key { + uint32_t len; + const char *label; +}; + +/* + * Add new key types to the end of this + * enumeration but before LAST_KEY_TYPE + */ +typedef enum { + MASTER_VOL_KEY, + NMTD_VOL_KEY, + NMTD_LINK_KEY, + EMTD_FILE_KEY, + DATA_FILE_KEY_256, + DATA_FILE_KEY_512, + LAST_KEY_TYPE +}crypt_key_type; + +struct kderive_context { + const unsigned char *pkey;/* parent key */ + uint32_t pkey_len; /* parent key size, bits */ + uint32_t ckey_len; /* child key size, bits */ + unsigned char *fid; /* fixed input data, NIST 800-108, 5.1 */ + uint32_t fid_len; /* fid len, bytes */ + unsigned char *out; /* contains child keying material */ + uint32_t out_len; /* out len, bytes */ +}; + +typedef enum { + DATA_ATOM, + HOLE_ATOM, + LAST_DATA_TYPE +}atom_data_type; + +typedef enum { + HEAD_ATOM, + TAIL_ATOM, + FULL_ATOM, + LAST_LOCALITY_TYPE +}atom_locality_type; + +typedef enum { + MTD_CREATE, + MTD_APPEND, + MTD_OVERWRITE, + MTD_CUT, + MTD_LAST_OP +} mtd_op_t; + +struct xts128_context { + void *key1, *key2; + block128_f block1,block2; 
+}; + +struct object_cipher_info { + cipher_alg_t o_alg; + cipher_mode_t o_mode; + uint32_t o_block_bits; + uint32_t o_dkey_size; /* raw data key size in bits */ + union { + struct { + unsigned char ivec[16]; + AES_KEY dkey[2]; + AES_KEY tkey; /* key used for tweaking */ + XTS128_CONTEXT xts; + } aes_xts; + } u; +}; + +struct master_cipher_info { + /* + * attributes inherited by newly created regular files + */ + cipher_alg_t m_alg; + cipher_mode_t m_mode; + uint32_t m_block_bits; + uint32_t m_dkey_size; /* raw key size in bits */ + /* + * master key + */ + unsigned char m_key[MASTER_VOL_KEY_SIZE]; + /* + * volume key for oid authentication + */ + unsigned char m_nmtd_key[NMTD_VOL_KEY_SIZE]; +}; + +/* +* This info is not changed during file's life + */ +struct crypt_inode_info { +#if DEBUG_CRYPT + loc_t *loc; /* pathname that the file has been + opened, or created with */ +#endif + uint16_t nr_minor; + uuid_t oid; + struct object_cipher_info cinfo; +}; + +/* + * this should locate in secure memory + */ +typedef struct { + struct master_cipher_info master; +} crypt_private_t; + +static inline struct master_cipher_info *get_master_cinfo(crypt_private_t *priv) +{ + return &priv->master; +} + +static inline struct object_cipher_info *get_object_cinfo(struct crypt_inode_info + *info) +{ + return &info->cinfo; +} + +/* + * this describes layouts and properties + * of atoms in an aligned vector + */ +struct avec_config { + uint32_t atom_size; + atom_data_type type; + size_t orig_size; + off_t orig_offset; + size_t expanded_size; + off_t aligned_offset; + + uint32_t off_in_head; + uint32_t off_in_tail; + uint32_t gap_in_tail; + uint32_t nr_full_blocks; + + struct iovec *avec; /* aligned vector */ + uint32_t acount; /* number of avec components. The same + * as number of occupied logical blocks */ + char **pool; + uint32_t blocks_in_pool; + uint32_t cursor; /* makes sense only for ordered writes, + * so there is no races on this counter. 
+ * + * Cursor is per-config object, we don't + * reset cursor for atoms of different + * localities (head, tail, full) + */ +}; + + +typedef struct { + glusterfs_fop_t fop; /* code of FOP this local info was built for */ + fd_t *fd; + inode_t *inode; + loc_t *loc; + int32_t mac_idx; + loc_t *newloc; + int32_t flags; + int32_t wbflags; + struct crypt_inode_info *info; + struct iobref *iobref; + struct iobref *iobref_data; + off_t offset; + + uint64_t old_file_size; /* per FOP, retrieved under lock held */ + uint64_t cur_file_size; /* per iteration, before issuing IOs */ + uint64_t new_file_size; /* per iteration, after issuing IOs */ + + uint64_t io_offset; /* offset of IOs issued per iteration */ + uint64_t io_offset_nopad; /* offset of user's data in the atom */ + uint32_t io_size; /* size of IOs issued per iteration */ + uint32_t io_size_nopad; /* size of user's data in the IOs */ + uint32_t eof_padding_size; /* size of EOF padding in the IOs */ + + gf_lock_t call_lock; /* protect nr_calls from many cbks */ + int32_t nr_calls; + + atom_data_type active_setup; /* which setup (hole or data) + is currently active */ + /* data setup */ + struct avec_config data_conf; + + /* hole setup */ + int hole_conv_in_proggress; + gf_lock_t hole_lock; /* protect hole config from many cbks */ + int hole_handled; + struct avec_config hole_conf; + struct iatt buf; + struct iatt prebuf; + struct iatt postbuf; + struct iatt *prenewparent; + struct iatt *postnewparent; + int32_t op_ret; + int32_t op_errno; + int32_t rw_count; /* total read or written */ + gf_lock_t rw_count_lock; /* protect the counter above */ + unsigned char *format; /* for create, update format string */ + uint32_t format_size; + uint32_t msgflags; /* messages for crypt_open() */ + dict_t *xdata; + dict_t *xattr; + struct iovec vec; /* contains last file's atom for + read-prune-write sequence */ + gf_boolean_t custom_mtd; + /* + * the next 3 fields are used by readdir and friends + */ + gf_dirent_t *de; /* directory 
entry */ + char *de_path; /* pathname of directory entry */ + uint32_t de_prefix_len; /* length of the parent's pathname */ + gf_dirent_t *entries; + + uint32_t update_disk_file_size:1; +} crypt_local_t; + +/* This represents a (read)modify-write atom */ +struct rmw_atom { + atom_locality_type locality; + /* + * read-modify-write sequence of the atom + */ + int32_t (*rmw)(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t op_errno, + struct iovec *vec, + int32_t count, + struct iatt *stbuf, + struct iobref *iobref, + dict_t *xdata); + /* + * offset of the logical block in a file + */ + loff_t (*offset_at)(call_frame_t *frame, + struct object_cipher_info *object); + /* + * IO offset in an atom + */ + uint32_t (*offset_in)(call_frame_t *frame, + struct object_cipher_info *object); + /* + * number of bytes of plain text of this atom that user + * wants to read/write. + * It can be smaller than atom_size in the case of head + * or tail atoms. + */ + uint32_t (*io_size_nopad)(call_frame_t *frame, + struct object_cipher_info *object); + /* + * which iovec represents the atom + */ + struct iovec *(*get_iovec)(call_frame_t *frame, uint32_t count); + /* + * how many bytes of a partial block must be brought up to date by + * reading from disk. + * This is used to perform the read component of RMW (read-modify-write). 
+ */ + uint32_t (*count_to_uptodate)(call_frame_t *frame, struct object_cipher_info *object); + struct avec_config *(*get_config)(call_frame_t *frame); +}; + +struct data_cipher_alg { + gf_boolean_t atomic; /* true means that algorithm requires + to pad data before cipher transform */ + gf_boolean_t should_pad; /* true means that algorithm requires + to pad the end of file with extra-data */ + uint32_t blkbits; /* blksize = 1 << blkbits */ + /* + * any preliminary sanity checks goes here + */ + int32_t (*init)(void); + /* + * set alg-mode specific inode info + */ + int32_t (*set_private)(struct crypt_inode_info *info, + struct master_cipher_info *master); + /* + * check alg-mode specific data key + */ + int32_t (*check_key)(uint32_t key_size); + void (*set_iv)(off_t offset, struct object_cipher_info *object); + int32_t (*encrypt)(const unsigned char *from, unsigned char *to, + size_t length, off_t offset, const int enc, + struct object_cipher_info *object); +}; + +/* + * version-dependent metadata loader + */ +struct crypt_mtd_loader { + /* + * return core format size + */ + size_t (*format_size)(mtd_op_t op, size_t old_size); + /* + * pack version-specific metadata of an object + * at ->create() + */ + int32_t (*create_format)(unsigned char *wire, + loc_t *loc, + struct crypt_inode_info *info, + struct master_cipher_info *master); + /* + * extract version-specific metadata of an object + * at ->open() time + */ + int32_t (*open_format)(unsigned char *wire, + int32_t len, + loc_t *loc, + struct crypt_inode_info *info, + struct master_cipher_info *master, + crypt_local_t *local, + gf_boolean_t load_info); + int32_t (*update_format)(unsigned char *new, + unsigned char *old, + size_t old_len, + int32_t mac_idx, + mtd_op_t op, + loc_t *loc, + struct crypt_inode_info *info, + struct master_cipher_info *master, + crypt_local_t *local); +}; + +typedef int32_t (*end_writeback_handler_t)(call_frame_t *frame, + void *cookie, + xlator_t *this, + int32_t op_ret, + int32_t 
op_errno, + struct iatt *prebuf, + struct iatt *postbuf, + dict_t *xdata); +typedef void (*linkop_wind_handler_t)(call_frame_t *frame, xlator_t *this); +typedef void (*linkop_unwind_handler_t)(call_frame_t *frame); + + +/* Declarations */ + +/* keys.c */ +extern struct crypt_key crypt_keys[LAST_KEY_TYPE]; +int32_t get_nmtd_vol_key(struct master_cipher_info *master); +int32_t get_nmtd_link_key(loc_t *loc, + struct master_cipher_info *master, + unsigned char *result); +int32_t get_emtd_file_key(struct crypt_inode_info *info, + struct master_cipher_info *master, + unsigned char *result); +int32_t get_data_file_key(struct crypt_inode_info *info, + struct master_cipher_info *master, + uint32_t keysize, + unsigned char *key); +/* data.c */ +extern struct data_cipher_alg data_cipher_algs[LAST_CIPHER_ALG][LAST_CIPHER_MODE]; +void encrypt_aligned_iov(struct object_cipher_info *object, + struct iovec *vec, + int count, + off_t off); +void decrypt_aligned_iov(struct object_cipher_info *object, + struct iovec *vec, + int count, + off_t off); +int32_t align_iov_by_atoms(xlator_t *this, + crypt_local_t *local, + struct object_cipher_info *object, + struct iovec *vec /* input vector */, + int32_t count /* number of vec components */, + struct iovec *avec /* aligned vector */, + char **blocks /* pool of blocks */, + uint32_t *blocks_allocated, + struct avec_config *conf); +int32_t set_config_avec_data(xlator_t *this, + crypt_local_t *local, + struct avec_config *conf, + struct object_cipher_info *object, + struct iovec *vec, + int32_t vec_count); +int32_t set_config_avec_hole(xlator_t *this, + crypt_local_t *local, + struct avec_config *conf, + struct object_cipher_info *object, + glusterfs_fop_t fop); +void set_gap_at_end(call_frame_t *frame, struct object_cipher_info *object, + struct avec_config *conf, atom_data_type dtype); +void set_config_offsets(call_frame_t *frame, + xlator_t *this, + uint64_t offset, + uint64_t count, + atom_data_type dtype, + int32_t setup_gap_in_tail); 
+ +/* metadata.c */ +extern struct crypt_mtd_loader mtd_loaders [LAST_MTD_LOADER]; + +int32_t alloc_format(crypt_local_t *local, size_t size); +int32_t alloc_format_create(crypt_local_t *local); +void free_format(crypt_local_t *local); +size_t format_size(mtd_op_t op, size_t old_size); +size_t new_format_size(void); +int32_t open_format(unsigned char *str, int32_t len, loc_t *loc, + struct crypt_inode_info *info, + struct master_cipher_info *master, crypt_local_t *local, + gf_boolean_t load_info); +int32_t update_format(unsigned char *new, unsigned char *old, + size_t old_len, int32_t mac_idx, mtd_op_t op, loc_t *loc, + struct crypt_inode_info *info, + struct master_cipher_info *master, + crypt_local_t *local); +int32_t create_format(unsigned char *wire, + loc_t *loc, + struct crypt_inode_info *info, + struct master_cipher_info *master); + +/* atom.c */ +struct rmw_atom *atom_by_types(atom_data_type data, + atom_locality_type locality); +void submit_partial(call_frame_t *frame, + xlator_t *this, + fd_t *fd, + atom_locality_type ltype); +void submit_full(call_frame_t *frame, xlator_t *this); + +/* crypt.c */ + +end_writeback_handler_t dispatch_end_writeback(glusterfs_fop_t fop); +static size_t iovec_get_size(struct iovec *vec, uint32_t count); +void set_local_io_params_writev(call_frame_t *frame, + struct object_cipher_info *object, + struct rmw_atom *atom, off_t io_offset, + uint32_t io_size); +void link_wind(call_frame_t *frame, xlator_t *this); +void unlink_wind(call_frame_t *frame, xlator_t *this); +void link_unwind(call_frame_t *frame); +void unlink_unwind(call_frame_t *frame); +void rename_wind(call_frame_t *frame, xlator_t *this); +void rename_unwind(call_frame_t *frame); + +/* Inline functions */ + +static inline size_t iovec_get_size(struct iovec *vec, uint32_t count) +{ + int i; + size_t size = 0; + for (i = 0; i < count; i++) + size += vec[i].iov_len; + return size; +} + +static inline int32_t crypt_xlator_id(void) +{ + return CRYPT_XLATOR_ID; +} + 
+static inline mtd_loader_id current_mtd_loader(void) +{ + return MTD_LOADER_V1; +} + +static inline uint32_t master_key_size (void) +{ + return crypt_keys[MASTER_VOL_KEY].len >> 3; +} + +static inline uint32_t nmtd_vol_key_size (void) +{ + return crypt_keys[NMTD_VOL_KEY].len >> 3; +} + +static inline uint32_t alg_mode_blkbits(cipher_alg_t alg, + cipher_mode_t mode) +{ + return data_cipher_algs[alg][mode].blkbits; +} + +static inline uint32_t alg_mode_blksize(cipher_alg_t alg, + cipher_mode_t mode) +{ + return 1 << alg_mode_blkbits(alg, mode); +} + +static inline gf_boolean_t alg_mode_atomic(cipher_alg_t alg, + cipher_mode_t mode) +{ + return data_cipher_algs[alg][mode].atomic; +} + +static inline gf_boolean_t alg_mode_should_pad(cipher_alg_t alg, + cipher_mode_t mode) +{ + return data_cipher_algs[alg][mode].should_pad; +} + +static inline uint32_t master_alg_blksize(struct master_cipher_info *mr) +{ + return alg_mode_blksize(mr->m_alg, mr->m_mode); +} + +static inline uint32_t master_alg_blkbits(struct master_cipher_info *mr) +{ + return alg_mode_blkbits(mr->m_alg, mr->m_mode); +} + +static inline gf_boolean_t master_alg_atomic(struct master_cipher_info *mr) +{ + return alg_mode_atomic(mr->m_alg, mr->m_mode); +} + +static inline gf_boolean_t master_alg_should_pad(struct master_cipher_info *mr) +{ + return alg_mode_should_pad(mr->m_alg, mr->m_mode); +} + +static inline uint32_t object_alg_blksize(struct object_cipher_info *ob) +{ + return alg_mode_blksize(ob->o_alg, ob->o_mode); +} + +static inline uint32_t object_alg_blkbits(struct object_cipher_info *ob) +{ + return alg_mode_blkbits(ob->o_alg, ob->o_mode); +} + +static inline gf_boolean_t object_alg_atomic(struct object_cipher_info *ob) +{ + return alg_mode_atomic(ob->o_alg, ob->o_mode); +} + +static inline gf_boolean_t object_alg_should_pad(struct object_cipher_info *ob) +{ + return alg_mode_should_pad(ob->o_alg, ob->o_mode); +} + +static inline uint32_t aes_raw_key_size(struct master_cipher_info *master) +{ + 
return master->m_dkey_size >> 3; +} + +static inline struct avec_config *get_hole_conf(call_frame_t *frame) +{ + return &(((crypt_local_t *)frame->local)->hole_conf); +} + +static inline struct avec_config *get_data_conf(call_frame_t *frame) +{ + return &(((crypt_local_t *)frame->local)->data_conf); +} + +static inline int32_t get_atom_bits (struct object_cipher_info *object) +{ + return object->o_block_bits; +} + +static inline int32_t get_atom_size (struct object_cipher_info *object) +{ + return 1 << get_atom_bits(object); +} + +static inline int32_t has_head_block(struct avec_config *conf) +{ + return conf->off_in_head || + (conf->acount == 1 && conf->off_in_tail); +} + +static inline int32_t has_tail_block(struct avec_config *conf) +{ + return conf->off_in_tail && conf->acount > 1; +} + +static inline int32_t has_full_blocks(struct avec_config *conf) +{ + return conf->nr_full_blocks; +} + +static inline int32_t should_submit_head_block(struct avec_config *conf) +{ + return has_head_block(conf) && (conf->cursor == 0); +} + +static inline int32_t should_submit_tail_block(struct avec_config *conf) +{ + return has_tail_block(conf) && (conf->cursor == conf->acount - 1); +} + +static inline int32_t should_submit_full_block(struct avec_config *conf) +{ + uint32_t start = has_head_block(conf) ? 
1 : 0; + + return has_full_blocks(conf) && + conf->cursor >= start && + conf->cursor < start + conf->nr_full_blocks; +} + +#if DEBUG_CRYPT +static inline void crypt_check_input_len(size_t len, + struct object_cipher_info *object) +{ + if (object_alg_should_pad(object) && (len & (object_alg_blksize(object) - 1))) + gf_log ("crypt", GF_LOG_DEBUG, "bad input len: %d", (int)len); +} + +static inline void check_head_block(struct avec_config *conf) +{ + if (!has_head_block(conf)) + gf_log("crypt", GF_LOG_DEBUG, "not a head atom"); +} + +static inline void check_tail_block(struct avec_config *conf) +{ + if (!has_tail_block(conf)) + gf_log("crypt", GF_LOG_DEBUG, "not a tail atom"); +} + +static inline void check_full_block(struct avec_config *conf) +{ + if (!has_full_blocks(conf)) + gf_log("crypt", GF_LOG_DEBUG, "not a full atom"); +} + +static inline void check_cursor_head(struct avec_config *conf) +{ + if (!has_head_block(conf)) + gf_log("crypt", + GF_LOG_DEBUG, "Illegal call of head atom method"); + else if (conf->cursor != 0) + gf_log("crypt", + GF_LOG_DEBUG, "Cursor (%d) is not at head atom", + conf->cursor); +} + +static inline void check_cursor_full(struct avec_config *conf) +{ + if (!has_full_blocks(conf)) + gf_log("crypt", + GF_LOG_DEBUG, "Illegal call of full atom method"); + if (has_head_block(conf) && (conf->cursor == 0)) + gf_log("crypt", + GF_LOG_DEBUG, "Cursor is not at full atom"); +} + +/* + * FIXME: use avec->iov_len to check setup + */ +static inline int data_local_invariant(crypt_local_t *local) +{ + return 0; +} + +#else +#define crypt_check_input_len(len, object) noop +#define check_head_block(conf) noop +#define check_tail_block(conf) noop +#define check_full_block(conf) noop +#define check_cursor_head(conf) noop +#define check_cursor_full(conf) noop + +#endif /* DEBUG_CRYPT */ + +static inline struct avec_config *conf_by_type(call_frame_t *frame, + atom_data_type dtype) +{ + struct avec_config *conf = NULL; + + switch (dtype) { + case HOLE_ATOM: + 
conf = get_hole_conf(frame); + break; + case DATA_ATOM: + conf = get_data_conf(frame); + break; + default: + gf_log("crypt", GF_LOG_DEBUG, "bad atom type"); + } + return conf; +} + +static inline uint32_t nr_calls_head(struct avec_config *conf) +{ + return has_head_block(conf) ? 1 : 0; +} + +static inline uint32_t nr_calls_tail(struct avec_config *conf) +{ + return has_tail_block(conf) ? 1 : 0; +} + +static inline uint32_t nr_calls_full(struct avec_config *conf) +{ + switch(conf->type) { + case HOLE_ATOM: + return has_full_blocks(conf); + case DATA_ATOM: + return has_full_blocks(conf) ? + logical_blocks_occupied(0, + conf->nr_full_blocks, + MAX_IOVEC_BITS) : 0; + default: + gf_log("crypt", GF_LOG_DEBUG, "bad atom data type"); + return 0; + } +} + +static inline uint32_t nr_calls(struct avec_config *conf) +{ + return nr_calls_head(conf) + nr_calls_tail(conf) + nr_calls_full(conf); +} + +static inline uint32_t nr_calls_data(call_frame_t *frame) +{ + return nr_calls(get_data_conf(frame)); +} + +static inline uint32_t nr_calls_hole(call_frame_t *frame) +{ + return nr_calls(get_hole_conf(frame)); +} + +static inline void get_one_call_nolock(call_frame_t *frame) +{ + crypt_local_t *local = frame->local; + + ++local->nr_calls; + + //gf_log("crypt", GF_LOG_DEBUG, "get %d calls", 1); +} + +static inline void get_one_call(call_frame_t *frame) +{ + crypt_local_t *local = frame->local; + + LOCK(&local->call_lock); + get_one_call_nolock(frame); + UNLOCK(&local->call_lock); +} + +static inline void get_nr_calls_nolock(call_frame_t *frame, int32_t nr) +{ + crypt_local_t *local = frame->local; + + local->nr_calls += nr; + + //gf_log("crypt", GF_LOG_DEBUG, "get %d calls", nr); +} + +static inline void get_nr_calls(call_frame_t *frame, int32_t nr) +{ + crypt_local_t *local = frame->local; + + LOCK(&local->call_lock); + get_nr_calls_nolock(frame, nr); + UNLOCK(&local->call_lock); +} + +static inline int put_one_call(crypt_local_t *local) +{ + uint32_t last = 0; + + 
LOCK(&local->call_lock); + if (--local->nr_calls == 0) + last = 1; + + //gf_log("crypt", GF_LOG_DEBUG, "put %d calls", 1); + + UNLOCK(&local->call_lock); + return last; +} + +static inline int is_appended_write(call_frame_t *frame) +{ + crypt_local_t *local = frame->local; + struct avec_config *conf = get_data_conf(frame); + + return conf->orig_offset + conf->orig_size > local->old_file_size; +} + +static inline int is_ordered_mode(call_frame_t *frame) +{ +#if 0 + crypt_local_t *local = frame->local; + return local->fop == GF_FOP_FTRUNCATE || + (local->fop == GF_FOP_WRITE && is_appended_write(frame)); +#endif + return 1; +} + +static inline int32_t hole_conv_completed(crypt_local_t *local) +{ + struct avec_config *conf = &local->hole_conf; + return conf->cursor == conf->acount; +} + +static inline int32_t data_write_in_progress(crypt_local_t *local) +{ + return local->active_setup == DATA_ATOM; +} + +static inline int32_t parent_is_crypt_xlator(call_frame_t *frame, + xlator_t *this) +{ + return frame->parent->this == this; +} + +static inline linkop_wind_handler_t linkop_wind_dispatch(glusterfs_fop_t fop) +{ + switch(fop){ + case GF_FOP_LINK: + return link_wind; + case GF_FOP_UNLINK: + return unlink_wind; + case GF_FOP_RENAME: + return rename_wind; + default: + gf_log("crypt", GF_LOG_ERROR, "Bad link operation %d", fop); + return NULL; + } +} + +static inline linkop_unwind_handler_t linkop_unwind_dispatch(glusterfs_fop_t fop) +{ + switch(fop){ + case GF_FOP_LINK: + return link_unwind; + case GF_FOP_UNLINK: + return unlink_unwind; + case GF_FOP_RENAME: + return rename_unwind; + default: + gf_log("crypt", GF_LOG_ERROR, "Bad link operation %d", fop); + return NULL; + } +} + +static inline mtd_op_t linkop_mtdop_dispatch(glusterfs_fop_t fop) +{ + switch (fop) { + case GF_FOP_LINK: + return MTD_APPEND; + case GF_FOP_UNLINK: + return MTD_CUT; + case GF_FOP_RENAME: + return MTD_OVERWRITE; + default: + gf_log("crypt", GF_LOG_WARNING, "Bad link operation %d", fop); + return 
MTD_LAST_OP; + } +} + +#endif /* __CRYPT_H__ */ + +/* + Local variables: + c-indentation-style: "K&R" + mode-name: "LC" + c-basic-offset: 8 + tab-width: 8 + fill-column: 80 + scroll-step: 1 + End: +*/ diff --git a/xlators/encryption/crypt/src/data.c b/xlators/encryption/crypt/src/data.c new file mode 100644 index 00000000000..762fa554ac2 --- /dev/null +++ b/xlators/encryption/crypt/src/data.c @@ -0,0 +1,769 @@ +/* + Copyright (c) 2008-2013 Red Hat, Inc. <http://www.redhat.com> + This file is part of GlusterFS. + + This file is licensed to you under your choice of the GNU Lesser + General Public License, version 3 or any later version (LGPLv3 or + later), or the GNU General Public License, version 2 (GPLv2), in all + cases as published by the Free Software Foundation. +*/ + +#ifndef _CONFIG_H +#define _CONFIG_H +#include "config.h" +#endif + +#include "defaults.h" +#include "crypt-common.h" +#include "crypt.h" + +static void set_iv_aes_xts(off_t offset, struct object_cipher_info *object) +{ + unsigned char *ivec; + + ivec = object->u.aes_xts.ivec; + + /* convert the tweak into a little-endian byte + * array (IEEE P1619/D16, May 2007, section 5.1) + */ + + *((uint64_t *)ivec) = htole64(offset); + + /* ivec is padded with zeroes */ +} + +static int32_t aes_set_keys_common(unsigned char *raw_key, uint32_t key_size, + AES_KEY *keys) +{ + int32_t ret; + + ret = AES_set_encrypt_key(raw_key, + key_size, + &keys[AES_ENCRYPT]); + if (ret) { + gf_log("crypt", GF_LOG_ERROR, "Set encrypt key failed"); + return ret; + } + ret = AES_set_decrypt_key(raw_key, + key_size, + &keys[AES_DECRYPT]); + if (ret) { + gf_log("crypt", GF_LOG_ERROR, "Set decrypt key failed"); + return ret; + } + return 0; +} + +/* + * set private cipher info for xts mode + */ +static int32_t set_private_aes_xts(struct crypt_inode_info *info, + struct master_cipher_info *master) +{ + int ret; + struct object_cipher_info *object = get_object_cinfo(info); + unsigned char *data_key; + uint32_t subkey_size; + + /* 
init tweak value */ + memset(object->u.aes_xts.ivec, 0, 16); + + data_key = GF_CALLOC(1, object->o_dkey_size, gf_crypt_mt_key); + if (!data_key) + return ENOMEM; + + /* + * retrieve data keying material + */ + ret = get_data_file_key(info, master, object->o_dkey_size, data_key); + if (ret) { + gf_log("crypt", GF_LOG_ERROR, "Failed to retrieve data key"); + GF_FREE(data_key); + return ret; + } + /* + * parse compound xts key + */ + subkey_size = object->o_dkey_size >> 4; /* (xts-key-size-in-bytes / 2) */ + /* + * install key for data encryption + */ + ret = aes_set_keys_common(data_key, + subkey_size << 3, object->u.aes_xts.dkey); + if (ret) { + GF_FREE(data_key); + return ret; + } + /* + * set up key used to encrypt tweaks + */ + ret = AES_set_encrypt_key(data_key + subkey_size, + object->o_dkey_size / 2, + &object->u.aes_xts.tkey); + if (ret < 0) + gf_log("crypt", GF_LOG_ERROR, "Set tweak key failed"); + + GF_FREE(data_key); + return ret; +} + +static int32_t aes_xts_init(void) +{ + cassert(AES_BLOCK_SIZE == (1 << AES_BLOCK_BITS)); + return 0; +} + +static int32_t check_key_aes_xts(uint32_t keysize) +{ + switch(keysize) { + case 256: + case 512: + return 0; + default: + break; + } + return -1; +} + +static int32_t encrypt_aes_xts(const unsigned char *from, + unsigned char *to, size_t length, + off_t offset, const int enc, + struct object_cipher_info *object) +{ + XTS128_CONTEXT ctx; + if (enc) { + ctx.key1 = &object->u.aes_xts.dkey[AES_ENCRYPT]; + ctx.block1 = (block128_f)AES_encrypt; + } + else { + ctx.key1 = &object->u.aes_xts.dkey[AES_DECRYPT]; + ctx.block1 = (block128_f)AES_decrypt; + } + ctx.key2 = &object->u.aes_xts.tkey; + ctx.block2 = (block128_f)AES_encrypt; + + return CRYPTO_xts128_encrypt(&ctx, + object->u.aes_xts.ivec, + from, + to, + length, enc); +} + +/* + * Cipher input chunk @from of length @len; + * @to: result of cipher transform; + * @off: offset in a file (must be cblock-aligned); + */ +static void cipher_data(struct object_cipher_info 
*object, + char *from, + char *to, + off_t off, + size_t len, + const int enc) +{ + crypt_check_input_len(len, object); + +#if TRIVIAL_TFM && DEBUG_CRYPT + return; +#endif + data_cipher_algs[object->o_alg][object->o_mode].set_iv(off, object); + data_cipher_algs[object->o_alg][object->o_mode].encrypt + ((const unsigned char *)from, + (unsigned char *)to, + len, + off, + enc, + object); +} + +#define MAX_CIPHER_CHUNK (1 << 30) + +/* + * Do cipher (encryption/decryption) transform of a + * continuous region of memory. + * + * @len: a number of bytes to transform; + * @buf: data to transform; + * @off: offset in a file, should be block-aligned + * for atomic cipher modes and ksize-aligned + * for other modes). + * @dir: direction of transform (encrypt/decrypt). + */ +static void cipher_region(struct object_cipher_info *object, + char *from, + char *to, + off_t off, + size_t len, + int dir) +{ + while (len > 0) { + size_t to_cipher; + + to_cipher = len; + if (to_cipher > MAX_CIPHER_CHUNK) + to_cipher = MAX_CIPHER_CHUNK; + + /* this will reset IV */ + cipher_data(object, + from, + to, + off, + to_cipher, + dir); + from += to_cipher; + to += to_cipher; + off += to_cipher; + len -= to_cipher; + } +} + +/* + * Do cipher transform (encryption/decryption) of + * plaintext/ciphertext represented by @vec. + * + * Pre-conditions: @vec represents a continuous piece + * of data in a file at offset @off to be ciphered + * (encrypted/decrypted). + * @count is the number of vec's components. All the + * components must be block-aligned, the caller is + * responsible for this. @dir is "direction" of + * transform (encrypt/decrypt). 
+ */ +static void cipher_aligned_iov(struct object_cipher_info *object, + struct iovec *vec, + int count, + off_t off, + int32_t dir) +{ + int i; + int len = 0; + + for (i = 0; i < count; i++) { + cipher_region(object, + vec[i].iov_base, + vec[i].iov_base, + off + len, + vec[i].iov_len, + dir); + len += vec[i].iov_len; + } +} + +void encrypt_aligned_iov(struct object_cipher_info *object, + struct iovec *vec, + int count, + off_t off) +{ + cipher_aligned_iov(object, vec, count, off, 1); +} + +void decrypt_aligned_iov(struct object_cipher_info *object, + struct iovec *vec, + int count, + off_t off) +{ + cipher_aligned_iov(object, vec, count, off, 0); +} + +#if DEBUG_CRYPT +static void compound_stream(struct iovec *vec, int count, char *buf, off_t skip) +{ + int i; + int off = 0; + for (i = 0; i < count; i++) { + memcpy(buf + off, + vec[i].iov_base + skip, + vec[i].iov_len - skip); + + off += (vec[i].iov_len - skip); + skip = 0; + } +} + +static void check_iovecs(struct iovec *vec, int cnt, + struct iovec *avec, int acnt, uint32_t off_in_head) +{ + char *s1, *s2; + uint32_t size, asize; + + size = iovec_get_size(vec, cnt); + asize = iovec_get_size(avec, acnt) - off_in_head; + if (size != asize) { + gf_log("crypt", GF_LOG_DEBUG, "size %d is not eq asize %d", + size, asize); + return; + } + s1 = GF_CALLOC(1, size, gf_crypt_mt_data); + if (!s1) { + gf_log("crypt", GF_LOG_DEBUG, "Can not allocate stream "); + return; + } + s2 = GF_CALLOC(1, asize, gf_crypt_mt_data); + if (!s2) { + GF_FREE(s1); + gf_log("crypt", GF_LOG_DEBUG, "Can not allocate stream "); + return; + } + compound_stream(vec, cnt, s1, 0); + compound_stream(avec, acnt, s2, off_in_head); + if (memcmp(s1, s2, size)) + gf_log("crypt", GF_LOG_DEBUG, "chunks of different data"); + GF_FREE(s1); + GF_FREE(s2); +} + +#else +#define check_iovecs(vec, count, avec, avecn, off) noop +#endif /* DEBUG_CRYPT */ + +static char *data_alloc_block(xlator_t *this, crypt_local_t *local, + int32_t block_size) +{ + struct iobuf 
*iobuf = NULL; + + iobuf = iobuf_get2(this->ctx->iobuf_pool, block_size); + if (!iobuf) { + gf_log("crypt", GF_LOG_ERROR, + "Failed to get iobuf"); + return NULL; + } + if (!local->iobref_data) { + local->iobref_data = iobref_new(); + if (!local->iobref_data) { + gf_log("crypt", GF_LOG_ERROR, + "Failed to get iobref"); + iobuf_unref(iobuf); + return NULL; + } + } + iobref_add(local->iobref_data, iobuf); + return iobuf->ptr; +} + +/* + * Compound @avec, which represent the same data + * chunk as @vec, but has aligned components of + * specified block size. Alloc blocks, if needed. + * In particular, incomplete head and tail blocks + * must be allocated. + * Put number of allocated blocks to @num_blocks. + * + * Example: + * + * input: data chunk represented by 4 components + * [AB],[BC],[CD],[DE]; + * output: 5 logical blocks (0, 1, 2, 3, 4). + * + * A B C D E + * *-----*+------*-+---*----+--------+-* + * | || | | | | | | + * *-+-----+*------+-*---+----*--------*-+------* + * 0 1 2 3 4 + * + * 0 - incomplete compound (head); + * 1, 2 - full compound; + * 3 - full non-compound (the case of reuse); + * 4 - incomplete non-compound (tail). + */ +int32_t align_iov_by_atoms(xlator_t *this, + crypt_local_t *local, + struct object_cipher_info *object, + struct iovec *vec /* input vector */, + int32_t count /* number of vec components */, + struct iovec *avec /* aligned vector */, + char **blocks /* pool of blocks */, + uint32_t *blocks_allocated, + struct avec_config *conf) +{ + int vecn = 0; /* number of the current component in vec */ + int avecn = 0; /* number of the current component in avec */ + off_t vec_off = 0; /* offset in the current vec component, + * i.e. 
the number of bytes have already + * been copied */ + int32_t block_size = get_atom_size(object); + size_t to_process; /* number of vec's bytes to copy and(or) re-use */ + int32_t off_in_head = conf->off_in_head; + + to_process = iovec_get_size(vec, count); + + while (to_process > 0) { + if (off_in_head || + vec[vecn].iov_len - vec_off < block_size) { + /* + * less than block_size: + * the case of incomplete (head or tail), + * or compound block + */ + size_t copied = 0; + /* + * populate the pool with a new block + */ + blocks[*blocks_allocated] = data_alloc_block(this, + local, + block_size); + if (!blocks[*blocks_allocated]) + return -ENOMEM; + memset(blocks[*blocks_allocated], 0, off_in_head); + /* + * fill the block with vec components + */ + do { + size_t to_copy; + + to_copy = vec[vecn].iov_len - vec_off; + if (to_copy > block_size - off_in_head) + to_copy = block_size - off_in_head; + + memcpy(blocks[*blocks_allocated] + off_in_head + copied, + vec[vecn].iov_base + vec_off, + to_copy); + + copied += to_copy; + to_process -= to_copy; + + vec_off += to_copy; + if (vec_off == vec[vecn].iov_len) { + /* finished with this vecn */ + vec_off = 0; + vecn++; + } + } while (copied < (block_size - off_in_head) && to_process > 0); + /* + * update avec + */ + avec[avecn].iov_len = off_in_head + copied; + avec[avecn].iov_base = blocks[*blocks_allocated]; + + (*blocks_allocated)++; + off_in_head = 0; + } else { + /* + * the rest of the current vec component + * is not less than block_size, so reuse + * the memory buffer of the component. + */ + size_t to_reuse; + to_reuse = (to_process > block_size ? 
+ block_size : + to_process); + avec[avecn].iov_len = to_reuse; + avec[avecn].iov_base = vec[vecn].iov_base + vec_off; + + vec_off += to_reuse; + if (vec_off == vec[vecn].iov_len) { + /* finished with this vecn */ + vec_off = 0; + vecn++; + } + to_process -= to_reuse; + } + avecn++; + } + check_iovecs(vec, count, avec, avecn, conf->off_in_head); + return 0; +} + +/* + * allocate and setup aligned vector for data submission + * Pre-condition: @conf is set. + */ +int32_t set_config_avec_data(xlator_t *this, + crypt_local_t *local, + struct avec_config *conf, + struct object_cipher_info *object, + struct iovec *vec, + int32_t vec_count) +{ + int32_t ret = ENOMEM; + struct iovec *avec; + char **pool; + uint32_t blocks_in_pool = 0; + + conf->type = DATA_ATOM; + + avec = GF_CALLOC(conf->acount, sizeof(*avec), gf_crypt_mt_iovec); + if (!avec) + return ret; + pool = GF_CALLOC(conf->acount, sizeof(pool), gf_crypt_mt_char); + if (!pool) { + GF_FREE(avec); + return ret; + } + if (!vec) { + /* + * degenerated case: no data + */ + pool[0] = data_alloc_block(this, local, get_atom_size(object)); + if (!pool[0]) + goto free; + blocks_in_pool = 1; + avec->iov_base = pool[0]; + avec->iov_len = conf->off_in_tail; + } + else { + ret = align_iov_by_atoms(this, local, object, vec, vec_count, + avec, pool, &blocks_in_pool, conf); + if (ret) + goto free; + } + conf->avec = avec; + conf->pool = pool; + conf->blocks_in_pool = blocks_in_pool; + return 0; + free: + GF_FREE(avec); + GF_FREE(pool); + return ret; +} + +/* + * allocate and setup aligned vector for hole submission + */ +int32_t set_config_avec_hole(xlator_t *this, + crypt_local_t *local, + struct avec_config *conf, + struct object_cipher_info *object, + glusterfs_fop_t fop) +{ + uint32_t i, idx; + struct iovec *avec; + char **pool; + uint32_t num_blocks; + uint32_t blocks_in_pool = 0; + + conf->type = HOLE_ATOM; + + num_blocks = conf->acount - + (conf->nr_full_blocks ? 
conf->nr_full_blocks - 1 : 0); + + switch (fop) { + case GF_FOP_WRITE: + /* + * hole goes before data + */ + if (num_blocks == 1 && conf->off_in_tail != 0) + /* + * we won't submit a hole which fits into + * a data atom: this part of hole will be + * submitted with data write + */ + return 0; + break; + case GF_FOP_FTRUNCATE: + /* + * expanding truncate, hole goes after data, + * and will be submitted in any case. + */ + break; + default: + gf_log("crypt", GF_LOG_WARNING, + "bad file operation %d", fop); + return 0; + } + avec = GF_CALLOC(num_blocks, sizeof(*avec), gf_crypt_mt_iovec); + if (!avec) + return ENOMEM; + pool = GF_CALLOC(num_blocks, sizeof(pool), gf_crypt_mt_char); + if (!pool) { + GF_FREE(avec); + return ENOMEM; + } + for (i = 0; i < num_blocks; i++) { + pool[i] = data_alloc_block(this, local, get_atom_size(object)); + if (pool[i] == NULL) + goto free; + blocks_in_pool++; + } + if (has_head_block(conf)) { + /* set head block */ + idx = 0; + avec[idx].iov_base = pool[idx]; + avec[idx].iov_len = get_atom_size(object); + memset(avec[idx].iov_base + conf->off_in_head, + 0, + get_atom_size(object) - conf->off_in_head); + } + if (has_tail_block(conf)) { + /* set tail block */ + idx = num_blocks - 1; + avec[idx].iov_base = pool[idx]; + avec[idx].iov_len = get_atom_size(object); + memset(avec[idx].iov_base, 0, conf->off_in_tail); + } + if (has_full_blocks(conf)) { + /* set full block */ + idx = conf->off_in_head ? 1 : 0; + avec[idx].iov_base = pool[idx]; + avec[idx].iov_len = get_atom_size(object); + /* + * since we re-use the buffer, + * zeroes will be set every time + * before encryption, see submit_full() + */ + } + conf->avec = avec; + conf->pool = pool; + conf->blocks_in_pool = blocks_in_pool; + return 0; + free: + GF_FREE(avec); + GF_FREE(pool); + return ENOMEM; +} + +/* A helper for setting up config of partial atoms (which + * participate in read-modify-write sequence). 
+ * + * Calculate and setup precise amount of "extra-bytes" + * that should be uptodated at the end of partial (not + * necessarily tail!) block. + * + * Pre-condition: local->old_file_size is valid! + * @conf contains setup, which is enough for correct calculation + * of has_tail_block(), ->get_offset(). + */ +void set_gap_at_end(call_frame_t *frame, struct object_cipher_info *object, + struct avec_config *conf, atom_data_type dtype) +{ + uint32_t to_block; + crypt_local_t *local = frame->local; + uint64_t old_file_size = local->old_file_size; + struct rmw_atom *partial = atom_by_types(dtype, + has_tail_block(conf) ? + TAIL_ATOM : HEAD_ATOM); + + if (old_file_size <= partial->offset_at(frame, object)) + to_block = 0; + else { + to_block = old_file_size - partial->offset_at(frame, object); + if (to_block > get_atom_size(object)) + to_block = get_atom_size(object); + } + if (to_block > conf->off_in_tail) + conf->gap_in_tail = to_block - conf->off_in_tail; + else + /* + * nothing to uptodate + */ + conf->gap_in_tail = 0; +} + +/* + * fill struct avec_config with offsets layouts + */ +void set_config_offsets(call_frame_t *frame, + xlator_t *this, + uint64_t offset, + uint64_t count, + atom_data_type dtype, + int32_t set_gap) +{ + crypt_local_t *local; + struct object_cipher_info *object; + struct avec_config *conf; + uint32_t resid; + + uint32_t atom_size; + uint32_t atom_bits; + + size_t orig_size; + off_t orig_offset; + size_t expanded_size; + off_t aligned_offset; + + uint32_t off_in_head = 0; + uint32_t off_in_tail = 0; + uint32_t nr_full_blocks; + int32_t size_full_blocks; + + uint32_t acount; /* number of aligned components to write. + * The same as number of occupied logical + * blocks (atoms) + */ + local = frame->local; + object = &local->info->cinfo; + conf = (dtype == DATA_ATOM ? 
+ get_data_conf(frame) : get_hole_conf(frame)); + + orig_offset = offset; + orig_size = count; + + atom_size = get_atom_size(object); + atom_bits = get_atom_bits(object); + + /* + * Round-down the start, + * round-up the end. + */ + resid = offset & (uint64_t)(atom_size - 1); + + if (resid) + off_in_head = resid; + aligned_offset = offset - off_in_head; + expanded_size = orig_size + off_in_head; + + /* calculate tail, + expand size forward */ + resid = (offset + orig_size) & (uint64_t)(atom_size - 1); + + if (resid) { + off_in_tail = resid; + expanded_size += (atom_size - off_in_tail); + } + /* + * calculate number of occupied blocks + */ + acount = expanded_size >> atom_bits; + /* + * calculate number of full blocks + */ + size_full_blocks = expanded_size; + if (off_in_head) + size_full_blocks -= atom_size; + if (off_in_tail && size_full_blocks > 0) + size_full_blocks -= atom_size; + nr_full_blocks = size_full_blocks >> atom_bits; + + conf->atom_size = atom_size; + conf->orig_size = orig_size; + conf->orig_offset = orig_offset; + conf->expanded_size = expanded_size; + conf->aligned_offset = aligned_offset; + + conf->off_in_head = off_in_head; + conf->off_in_tail = off_in_tail; + conf->nr_full_blocks = nr_full_blocks; + conf->acount = acount; + /* + * Finally, calculate precise amount of + * "extra-bytes" that should be uptodated + * at the end. + * Only if RMW is expected. 
+ */ + if (off_in_tail && set_gap) + set_gap_at_end(frame, object, conf, dtype); +} + +struct data_cipher_alg data_cipher_algs[LAST_CIPHER_ALG][LAST_CIPHER_MODE] = { + [AES_CIPHER_ALG][XTS_CIPHER_MODE] = + { .atomic = _gf_true, + .should_pad = _gf_true, + .blkbits = AES_BLOCK_BITS, + .init = aes_xts_init, + .set_private = set_private_aes_xts, + .check_key = check_key_aes_xts, + .set_iv = set_iv_aes_xts, + .encrypt = encrypt_aes_xts + } +}; + +/* + Local variables: + c-indentation-style: "K&R" + mode-name: "LC" + c-basic-offset: 8 + tab-width: 8 + fill-column: 80 + scroll-step: 1 + End: +*/ diff --git a/xlators/encryption/crypt/src/keys.c b/xlators/encryption/crypt/src/keys.c new file mode 100644 index 00000000000..4a1d3bb5a09 --- /dev/null +++ b/xlators/encryption/crypt/src/keys.c @@ -0,0 +1,302 @@ +/* + Copyright (c) 2008-2013 Red Hat, Inc. <http://www.redhat.com> + This file is part of GlusterFS. + + This file is licensed to you under your choice of the GNU Lesser + General Public License, version 3 or any later version (LGPLv3 or + later), or the GNU General Public License, version 2 (GPLv2), in all + cases as published by the Free Software Foundation. 
+*/ + +#ifndef _CONFIG_H +#define _CONFIG_H +#include "config.h" +#endif + +#include "defaults.h" +#include "crypt-common.h" +#include "crypt.h" + +/* Key hierarchy + + +----------------+ + | MASTER_VOL_KEY | + +-------+--------+ + | + | + +----------------+----------------+ + | | | + | | | + +-------+------+ +-------+-------+ +------+--------+ + | NMTD_VOL_KEY | | EMTD_FILE_KEY | | DATA_FILE_KEY | + +-------+------+ +---------------+ +---------------+ + | + | + +-------+-------+ + | NMTD_LINK_KEY | + +---------------+ + + */ + +#if DEBUG_CRYPT +static void check_prf_iters(uint32_t num_iters) +{ + if (num_iters == 0) + gf_log ("crypt", GF_LOG_DEBUG, + "bad number of prf iterations : %d", num_iters); +} +#else +#define check_prf_iters(num_iters) noop +#endif /* DEBUG_CRYPT */ + +unsigned char crypt_fake_oid[16] = + {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}; + +/* + * derive key in the counter mode using + * sha256-based HMAC as PRF (see + * NIST Special Publication 800-108, 5.1) + */ + +#define PRF_OUTPUT_SIZE SHA256_DIGEST_LENGTH + +static int32_t kderive_init(struct kderive_context *ctx, + const unsigned char *pkey, /* parent key */ + uint32_t pkey_size, /* parent key size */ + const unsigned char *idctx, /* id-context */ + uint32_t idctx_size, + crypt_key_type type /* type of child key */) +{ + unsigned char *pos; + uint32_t llen = strlen(crypt_keys[type].label); + /* + * Compound the fixed input data for KDF: + * [i]_2 || Label || 0x00 || Id-Context || [L]_2, + * NIST SP 800-108, 5.1 + */ + ctx->fid_len = + sizeof(uint32_t) + + llen + + 1 + + idctx_size + + sizeof(uint32_t); + + ctx->fid = GF_CALLOC(ctx->fid_len, 1, gf_crypt_mt_key); + if (!ctx->fid) + return ENOMEM; + ctx->out_len = round_up(crypt_keys[type].len >> 3, + PRF_OUTPUT_SIZE); + ctx->out = GF_CALLOC(ctx->out_len, 1, gf_crypt_mt_key); + if (!ctx->out) { + GF_FREE(ctx->fid); + return ENOMEM; + } + ctx->pkey = pkey; + ctx->pkey_len = pkey_size; + ctx->ckey_len = crypt_keys[type].len; + + pos = 
ctx->fid; + + /* counter will be set up in kderive_update() */ + pos += sizeof(uint32_t); + + memcpy(pos, crypt_keys[type].label, llen); + pos += llen; + + /* set up zero octet */ + *pos = 0; + pos += 1; + + memcpy(pos, idctx, idctx_size); + pos += idctx_size; + + *((uint32_t *)pos) = htobe32(ctx->ckey_len); + + return 0; +} + +static void kderive_update(struct kderive_context *ctx) +{ + uint32_t i; + HMAC_CTX hctx; + unsigned char *pos = ctx->out; + uint32_t *p_iter = (uint32_t *)ctx->fid; + uint32_t num_iters = ctx->out_len / PRF_OUTPUT_SIZE; + + check_prf_iters(num_iters); + + HMAC_CTX_init(&hctx); + for (i = 0; i < num_iters; i++) { + /* + * update the iteration number in the fid + */ + *p_iter = htobe32(i); + HMAC_Init_ex(&hctx, + ctx->pkey, ctx->pkey_len >> 3, + EVP_sha256(), + NULL); + HMAC_Update(&hctx, ctx->fid, ctx->fid_len); + HMAC_Final(&hctx, pos, NULL); + + pos += PRF_OUTPUT_SIZE; + } + HMAC_CTX_cleanup(&hctx); +} + +static void kderive_final(struct kderive_context *ctx, unsigned char *child) +{ + memcpy(child, ctx->out, ctx->ckey_len >> 3); + GF_FREE(ctx->fid); + GF_FREE(ctx->out); + memset(ctx, 0, sizeof(*ctx)); +} + +/* + * derive per-volume key for object ids authentication + */ +int32_t get_nmtd_vol_key(struct master_cipher_info *master) +{ + int32_t ret; + struct kderive_context ctx; + + ret = kderive_init(&ctx, + master->m_key, + master_key_size(), + crypt_fake_oid, sizeof(uuid_t), NMTD_VOL_KEY); + if (ret) + return ret; + kderive_update(&ctx); + kderive_final(&ctx, master->m_nmtd_key); + return 0; +} + +/* + * derive per-link key for authentication of non-encrypted + * meta-data (nmtd) + */ +int32_t get_nmtd_link_key(loc_t *loc, + struct master_cipher_info *master, + unsigned char *result) +{ + int32_t ret; + struct kderive_context ctx; + + ret = kderive_init(&ctx, + master->m_nmtd_key, + nmtd_vol_key_size(), + (const unsigned char *)loc->path, + strlen(loc->path), NMTD_LINK_KEY); + if (ret) + return ret; + kderive_update(&ctx); + 
kderive_final(&ctx, result); + return 0; +} + +/* + * derive per-file key for encryption and authentication + * of encrypted part of metadata (emtd) + */ +int32_t get_emtd_file_key(struct crypt_inode_info *info, + struct master_cipher_info *master, + unsigned char *result) +{ + int32_t ret; + struct kderive_context ctx; + + ret = kderive_init(&ctx, + master->m_key, + master_key_size(), + info->oid, sizeof(uuid_t), EMTD_FILE_KEY); + if (ret) + return ret; + kderive_update(&ctx); + kderive_final(&ctx, result); + return 0; +} + +static int32_t data_key_type_by_size(uint32_t keysize, crypt_key_type *type) +{ + int32_t ret = 0; + switch (keysize) { + case 256: + *type = DATA_FILE_KEY_256; + break; + case 512: + *type = DATA_FILE_KEY_512; + break; + default: + gf_log("crypt", GF_LOG_ERROR, "Unsupported data key size %d", + keysize); + ret = ENOTSUP; + break; + } + return ret; +} + +/* + * derive per-file key for data encryption + */ +int32_t get_data_file_key(struct crypt_inode_info *info, + struct master_cipher_info *master, + uint32_t keysize, + unsigned char *key) +{ + int32_t ret; + struct kderive_context ctx; + crypt_key_type type; + + ret = data_key_type_by_size(keysize, &type); + if (ret) + return ret; + ret = kderive_init(&ctx, + master->m_key, + master_key_size(), + info->oid, sizeof(uuid_t), type); + if (ret) + return ret; + kderive_update(&ctx); + kderive_final(&ctx, key); + return 0; +} + +/* + * NOTE: Don't change existing keys: it will break compatibility; + */ +struct crypt_key crypt_keys[LAST_KEY_TYPE] = { + [MASTER_VOL_KEY] = + { .len = MASTER_VOL_KEY_SIZE << 3, + .label = "volume-master", + }, + [NMTD_VOL_KEY] = + { .len = NMTD_VOL_KEY_SIZE << 3, + .label = "volume-nmtd-key-generation" + }, + [NMTD_LINK_KEY] = + { .len = 128, + .label = "link-nmtd-authentication" + }, + [EMTD_FILE_KEY] = + { .len = 128, + .label = "file-emtd-encryption-and-auth" + }, + [DATA_FILE_KEY_256] = + { .len = 256, + .label = "file-data-encryption-256" + }, + [DATA_FILE_KEY_512] 
= + { .len = 512, + .label = "file-data-encryption-512" + } +}; + +/* + Local variables: + c-indentation-style: "K&R" + mode-name: "LC" + c-basic-offset: 8 + tab-width: 8 + fill-column: 80 + scroll-step: 1 + End: +*/ diff --git a/xlators/encryption/crypt/src/metadata.c b/xlators/encryption/crypt/src/metadata.c new file mode 100644 index 00000000000..36b14c0558e --- /dev/null +++ b/xlators/encryption/crypt/src/metadata.c @@ -0,0 +1,605 @@ +/* + Copyright (c) 2008-2012 Red Hat, Inc. <http://www.redhat.com> + This file is part of GlusterFS. + + This file is licensed to you under your choice of the GNU Lesser + General Public License, version 3 or any later version (LGPLv3 or + later), or the GNU General Public License, version 2 (GPLv2), in all + cases as published by the Free Software Foundation. +*/ + +#ifndef _CONFIG_H +#define _CONFIG_H +#include "config.h" +#endif + +#include "defaults.h" +#include "crypt-common.h" +#include "crypt.h" +#include "metadata.h" + +int32_t alloc_format(crypt_local_t *local, size_t size) +{ + if (size > 0) { + local->format = GF_CALLOC(1, size, gf_crypt_mt_mtd); + if (!local->format) + return ENOMEM; + } + local->format_size = size; + return 0; +} + +int32_t alloc_format_create(crypt_local_t *local) +{ + return alloc_format(local, new_format_size()); +} + +void free_format(crypt_local_t *local) +{ + GF_FREE(local->format); +} + +/* + * Check compatibility with extracted metadata + */ +static int32_t check_file_metadata(struct crypt_inode_info *info) +{ + struct object_cipher_info *object = &info->cinfo; + + if (info->nr_minor != CRYPT_XLATOR_ID) { + gf_log("crypt", GF_LOG_WARNING, + "unsupported minor subversion %d", info->nr_minor); + return EINVAL; + } + if (object->o_alg > LAST_CIPHER_ALG) { + gf_log("crypt", GF_LOG_WARNING, + "unsupported cipher algorithm %d", + object->o_alg); + return EINVAL; + } + if (object->o_mode > LAST_CIPHER_MODE) { + gf_log("crypt", GF_LOG_WARNING, + "unsupported cipher mode %d", + object->o_mode); + 
return EINVAL; + } + if (object->o_block_bits < CRYPT_MIN_BLOCK_BITS || + object->o_block_bits > CRYPT_MAX_BLOCK_BITS) { + gf_log("crypt", GF_LOG_WARNING, "unsupported block bits %d", + object->o_block_bits); + return EINVAL; + } + /* TBD: check data key size */ + return 0; +} + +static size_t format_size_v1(mtd_op_t op, size_t old_size) +{ + + switch (op) { + case MTD_CREATE: + return sizeof(struct mtd_format_v1); + case MTD_OVERWRITE: + return old_size; + case MTD_APPEND: + return old_size + NMTD_8_MAC_SIZE; + case MTD_CUT: + if (old_size > sizeof(struct mtd_format_v1)) + return old_size - NMTD_8_MAC_SIZE; + else + return 0; + default: + gf_log("crypt", GF_LOG_WARNING, "Bad mtd operation"); + return 0; + } +} + +/* + * Calculate size of the updated format string. + * Returned zero means that we don't need to update the format string. + */ +size_t format_size(mtd_op_t op, size_t old_size) +{ + size_t versioned; + + versioned = mtd_loaders[current_mtd_loader()].format_size(op, + old_size - sizeof(struct crypt_format)); + if (versioned != 0) + return versioned + sizeof(struct crypt_format); + return 0; +} + +/* + * size of the format string of newly created file (nr_links = 1) + */ +size_t new_format_size(void) +{ + return format_size(MTD_CREATE, 0); +} + +/* + * Calculate per-link MAC by pathname + */ +static int32_t calc_link_mac_v1(struct mtd_format_v1 *fmt, + loc_t *loc, + unsigned char *result, + struct crypt_inode_info *info, + struct master_cipher_info *master) +{ + int32_t ret; + unsigned char nmtd_link_key[16]; + CMAC_CTX *cctx; + size_t len; + + ret = get_nmtd_link_key(loc, master, nmtd_link_key); + if (ret) { + gf_log("crypt", GF_LOG_ERROR, "Can not get nmtd link key"); + return -1; + } + cctx = CMAC_CTX_new(); + if (!cctx) { + gf_log("crypt", GF_LOG_ERROR, "CMAC_CTX_new failed"); + return -1; + } + ret = CMAC_Init(cctx, nmtd_link_key, sizeof(nmtd_link_key), + EVP_aes_128_cbc(), 0); + if (!ret) { + gf_log("crypt", GF_LOG_ERROR, "CMAC_Init failed"); + 
CMAC_CTX_free(cctx); + return -1; + } + ret = CMAC_Update(cctx, get_NMTD_V1(info), SIZE_OF_NMTD_V1); + if (!ret) { + gf_log("crypt", GF_LOG_ERROR, "CMAC_Update failed"); + CMAC_CTX_free(cctx); + return -1; + } + ret = CMAC_Final(cctx, result, &len); + CMAC_CTX_free(cctx); + if (!ret) { + gf_log("crypt", GF_LOG_ERROR, "CMAC_Final failed"); + return -1; + } + return 0; +} + +/* + * Create per-link MAC of index @idx by pathname + */ +static int32_t create_link_mac_v1(struct mtd_format_v1 *fmt, + uint32_t idx, + loc_t *loc, + struct crypt_inode_info *info, + struct master_cipher_info *master) +{ + int32_t ret; + unsigned char *mac; + unsigned char cmac[16]; + + mac = get_NMTD_V1_MAC(fmt) + idx * SIZE_OF_NMTD_V1_MAC; + + ret = calc_link_mac_v1(fmt, loc, cmac, info, master); + if (ret) + return -1; + memcpy(mac, cmac, SIZE_OF_NMTD_V1_MAC); + return 0; +} + +static int32_t create_format_v1(unsigned char *wire, + loc_t *loc, + struct crypt_inode_info *info, + struct master_cipher_info *master) +{ + int32_t ret; + struct mtd_format_v1 *fmt; + unsigned char mtd_key[16]; + AES_KEY EMTD_KEY; + unsigned char nmtd_link_key[16]; + uint32_t ad; + GCM128_CONTEXT *gctx; + + fmt = (struct mtd_format_v1 *)wire; + + fmt->minor_id = info->nr_minor; + fmt->alg_id = AES_CIPHER_ALG; + fmt->dkey_factor = master->m_dkey_size >> KEY_FACTOR_BITS; + fmt->block_bits = master->m_block_bits; + fmt->mode_id = master->m_mode; + /* + * retrieve keys for the parts of metadata + */ + ret = get_emtd_file_key(info, master, mtd_key); + if (ret) + return ret; + ret = get_nmtd_link_key(loc, master, nmtd_link_key); + if (ret) + return ret; + + AES_set_encrypt_key(mtd_key, sizeof(mtd_key)*8, &EMTD_KEY); + + gctx = CRYPTO_gcm128_new(&EMTD_KEY, (block128_f)AES_encrypt); + + /* TBD: Check return values */ + + CRYPTO_gcm128_setiv(gctx, info->oid, sizeof(uuid_t)); + + ad = htole32(MTD_LOADER_V1); + ret = CRYPTO_gcm128_aad(gctx, (const unsigned char *)&ad, sizeof(ad)); + if (ret) { + gf_log("crypt", GF_LOG_ERROR, " 
CRYPTO_gcm128_aad failed"); + CRYPTO_gcm128_release(gctx); + return ret; + } + ret = CRYPTO_gcm128_encrypt(gctx, + get_EMTD_V1(fmt), + get_EMTD_V1(fmt), + SIZE_OF_EMTD_V1); + if (ret) { + gf_log("crypt", GF_LOG_ERROR, " CRYPTO_gcm128_encrypt failed"); + CRYPTO_gcm128_release(gctx); + return ret; + } + /* + * set MAC of encrypted part of metadata + */ + CRYPTO_gcm128_tag(gctx, get_EMTD_V1_MAC(fmt), SIZE_OF_EMTD_V1_MAC); + CRYPTO_gcm128_release(gctx); + /* + * set the first MAC of non-encrypted part of metadata + */ + return create_link_mac_v1(fmt, 0, loc, info, master); +} + +/* + * Called by fops: + * ->create(); + * ->link(); + * + * Pack common and version-specific parts of file's metadata + * Pre-conditions: @info contains valid object-id. + */ +int32_t create_format(unsigned char *wire, + loc_t *loc, + struct crypt_inode_info *info, + struct master_cipher_info *master) +{ + struct crypt_format *fmt = (struct crypt_format *)wire; + + fmt->loader_id = current_mtd_loader(); + + wire += sizeof(struct crypt_format); + return mtd_loaders[current_mtd_loader()].create_format(wire, loc, + info, master); +} + +/* + * Append or overwrite per-link mac of @mac_idx index + * in accordance with the new pathname + */ +int32_t appov_link_mac_v1(unsigned char *new, + unsigned char *old, + uint32_t old_size, + int32_t mac_idx, + loc_t *loc, + struct crypt_inode_info *info, + struct master_cipher_info *master, + crypt_local_t *local) +{ + memcpy(new, old, old_size); + return create_link_mac_v1((struct mtd_format_v1 *)new, mac_idx, + loc, info, master); +} + +/* + * Cut per-link mac of @mac_idx index + */ +static int32_t cut_link_mac_v1(unsigned char *new, + unsigned char *old, + uint32_t old_size, + int32_t mac_idx, + loc_t *loc, + struct crypt_inode_info *info, + struct master_cipher_info *master, + crypt_local_t *local) +{ + memcpy(new, + old, + sizeof(struct mtd_format_v1) + NMTD_8_MAC_SIZE * (mac_idx - 1)); + + memcpy(new + sizeof(struct mtd_format_v1) + NMTD_8_MAC_SIZE * 
(mac_idx - 1), + old + sizeof(struct mtd_format_v1) + NMTD_8_MAC_SIZE * mac_idx, + old_size - (sizeof(struct mtd_format_v1) + NMTD_8_MAC_SIZE * mac_idx)); + return 0; +} + +int32_t update_format_v1(unsigned char *new, + unsigned char *old, + size_t old_len, + int32_t mac_idx, /* of old name */ + mtd_op_t op, + loc_t *loc, + struct crypt_inode_info *info, + struct master_cipher_info *master, + crypt_local_t *local) +{ + switch (op) { + case MTD_APPEND: + mac_idx = 1 + (old_len - sizeof(struct mtd_format_v1))/8; + case MTD_OVERWRITE: + return appov_link_mac_v1(new, old, old_len, mac_idx, + loc, info, master, local); + case MTD_CUT: + return cut_link_mac_v1(new, old, old_len, mac_idx, + loc, info, master, local); + default: + gf_log("crypt", GF_LOG_ERROR, "Bad mtd operation %d", op); + return -1; + } +} + +/* + * Called by fops: + * + * ->link() + * ->unlink() + * ->rename() + * + */ +int32_t update_format(unsigned char *new, + unsigned char *old, + size_t old_len, + int32_t mac_idx, + mtd_op_t op, + loc_t *loc, + struct crypt_inode_info *info, + struct master_cipher_info *master, + crypt_local_t *local) +{ + if (!new) + return 0; + memcpy(new, old, sizeof(struct crypt_format)); + + old += sizeof(struct crypt_format); + new += sizeof(struct crypt_format); + old_len -= sizeof(struct crypt_format); + + return mtd_loaders[current_mtd_loader()].update_format(new, old, + old_len, + mac_idx, op, + loc, info, + master, local); +} + +/* + * Perform preliminary checks of found metadata + * Return < 0 on errors; + * Return number of object-id MACs (>= 1) on success + */ +int32_t check_format_v1(uint32_t len, unsigned char *wire) +{ + uint32_t nr_links; + + if (len < sizeof(struct mtd_format_v1)) { + gf_log("crypt", GF_LOG_ERROR, + "v1-loader: bad metadata size %d", len); + goto error; + } + len -= sizeof(struct mtd_format_v1); + if (len % sizeof(nmtd_8_mac_t)) { + gf_log("crypt", GF_LOG_ERROR, + "v1-loader: bad metadata format"); + goto error; + } + nr_links = 1 + len / 
sizeof(nmtd_8_mac_t); + if (nr_links > _POSIX_LINK_MAX) + goto error; + return nr_links; + error: + return EIO; +} + +/* + * Verify per-link MAC specified by index @idx + * + * return: + * -1 on errors; + * 0 on failed verification; + * 1 on successful verification + */ +static int32_t verify_link_mac_v1(struct mtd_format_v1 *fmt, + uint32_t idx /* index of the mac to verify */, + loc_t *loc, + struct crypt_inode_info *info, + struct master_cipher_info *master) +{ + int32_t ret; + unsigned char *mac; + unsigned char cmac[16]; + + mac = get_NMTD_V1_MAC(fmt) + idx * SIZE_OF_NMTD_V1_MAC; + + ret = calc_link_mac_v1(fmt, loc, cmac, info, master); + if (ret) + return -1; + if (memcmp(cmac, mac, SIZE_OF_NMTD_V1_MAC)) + return 0; + return 1; +} + +/* + * Lookup per-link MAC by pathname. + * + * return index of the MAC, if it was found; + * return < 0 on errors, or if the MAC wasn't found + */ +static int32_t lookup_link_mac_v1(struct mtd_format_v1 *fmt, + uint32_t nr_macs, + loc_t *loc, + struct crypt_inode_info *info, + struct master_cipher_info *master) +{ + int32_t ret; + uint32_t idx; + + for (idx = 0; idx < nr_macs; idx++) { + ret = verify_link_mac_v1(fmt, idx, loc, info, master); + if (ret < 0) + return ret; + if (ret > 0) + return idx; + } + return -ENOENT; +} + +/* + * Extract version-specific part of metadata + */ +static int32_t open_format_v1(unsigned char *wire, + int32_t len, + loc_t *loc, + struct crypt_inode_info *info, + struct master_cipher_info *master, + crypt_local_t *local, + gf_boolean_t load_info) +{ + int32_t ret; + int32_t num_nmtd_macs; + struct mtd_format_v1 *fmt; + unsigned char mtd_key[16]; + AES_KEY EMTD_KEY; + GCM128_CONTEXT *gctx; + uint32_t ad; + emtd_8_mac_t gmac; + struct object_cipher_info *object; + + num_nmtd_macs = check_format_v1(len, wire); + if (num_nmtd_macs <= 0) + return EIO; + fmt = (struct mtd_format_v1 *)wire; + + ret = lookup_link_mac_v1(fmt, num_nmtd_macs, loc, info, master); + if (ret < 0) { + gf_log("crypt", GF_LOG_ERROR, 
"NMTD verification failed"); + return EINVAL; + } + local->mac_idx = ret; + if (load_info == _gf_false) + /* the case of partial open */ + return 0; + + object = &info->cinfo; + + ret = get_emtd_file_key(info, master, mtd_key); + if (ret) { + gf_log("crypt", GF_LOG_ERROR, "Can not retrieve metadata key"); + return ret; + } + /* + * decrypt encrypted meta-data + */ + ret = AES_set_encrypt_key(mtd_key, sizeof(mtd_key)*8, &EMTD_KEY); + if (ret < 0) { + gf_log("crypt", GF_LOG_ERROR, "Can not set encrypt key"); + return ret; + } + gctx = CRYPTO_gcm128_new(&EMTD_KEY, (block128_f)AES_encrypt); + if (!gctx) { + gf_log("crypt", GF_LOG_ERROR, "Can not alloc gcm context"); + return ENOMEM; + } + CRYPTO_gcm128_setiv(gctx, info->oid, sizeof(uuid_t)); + + ad = htole32(MTD_LOADER_V1); + ret = CRYPTO_gcm128_aad(gctx, (const unsigned char *)&ad, sizeof(ad)); + if (ret) { + gf_log("crypt", GF_LOG_ERROR, " CRYPTO_gcm128_aad failed"); + CRYPTO_gcm128_release(gctx); + return ret; + } + ret = CRYPTO_gcm128_decrypt(gctx, + get_EMTD_V1(fmt), + get_EMTD_V1(fmt), + SIZE_OF_EMTD_V1); + if (ret) { + gf_log("crypt", GF_LOG_ERROR, " CRYPTO_gcm128_decrypt failed"); + CRYPTO_gcm128_release(gctx); + return ret; + } + /* + * verify metadata + */ + CRYPTO_gcm128_tag(gctx, gmac, sizeof(gmac)); + CRYPTO_gcm128_release(gctx); + if (memcmp(gmac, get_EMTD_V1_MAC(fmt), SIZE_OF_EMTD_V1_MAC)) { + gf_log("crypt", GF_LOG_ERROR, "EMTD verification failed"); + return EINVAL; + } + /* + * load verified metadata to the private part of inode + */ + info->nr_minor = fmt->minor_id; + + object->o_alg = fmt->alg_id; + object->o_dkey_size = fmt->dkey_factor << KEY_FACTOR_BITS; + object->o_block_bits = fmt->block_bits; + object->o_mode = fmt->mode_id; + + return check_file_metadata(info); +} + +/* + * perform metadata authentication against @loc->path; + * extract crypt-specific attribtes and populate @info + * with them (optional) + */ +int32_t open_format(unsigned char *str, + int32_t len, + loc_t *loc, + struct 
crypt_inode_info *info, + struct master_cipher_info *master, + crypt_local_t *local, + gf_boolean_t load_info) +{ + struct crypt_format *fmt; + if (len < sizeof(*fmt)) { + gf_log("crypt", GF_LOG_ERROR, "Bad core format"); + return EIO; + } + fmt = (struct crypt_format *)str; + + if (fmt->loader_id >= LAST_MTD_LOADER) { + gf_log("crypt", GF_LOG_ERROR, + "Unsupported loader id %d", fmt->loader_id); + return EINVAL; + } + str += sizeof(*fmt); + len -= sizeof(*fmt); + + return mtd_loaders[fmt->loader_id].open_format(str, + len, + loc, + info, + master, + local, + load_info); +} + +struct crypt_mtd_loader mtd_loaders [LAST_MTD_LOADER] = { + [MTD_LOADER_V1] = + {.format_size = format_size_v1, + .create_format = create_format_v1, + .open_format = open_format_v1, + .update_format = update_format_v1 + } +}; + +/* + Local variables: + c-indentation-style: "K&R" + mode-name: "LC" + c-basic-offset: 8 + tab-width: 8 + fill-column: 80 + scroll-step: 1 + End: +*/ diff --git a/xlators/encryption/crypt/src/metadata.h b/xlators/encryption/crypt/src/metadata.h new file mode 100644 index 00000000000..a92f149ef50 --- /dev/null +++ b/xlators/encryption/crypt/src/metadata.h @@ -0,0 +1,74 @@ +/* + Copyright (c) 2008-2013 Red Hat, Inc. <http://www.redhat.com> + This file is part of GlusterFS. + + This file is licensed to you under your choice of the GNU Lesser + General Public License, version 3 or any later version (LGPLv3 or + later), or the GNU General Public License, version 2 (GPLv2), in all + cases as published by the Free Software Foundation. +*/ + +#ifndef __METADATA_H__ +#define __METADATA_H__ + +#define NMTD_8_MAC_SIZE (8) +#define EMTD_8_MAC_SIZE (8) + +typedef uint8_t nmtd_8_mac_t[NMTD_8_MAC_SIZE]; +typedef uint8_t emtd_8_mac_t[EMTD_8_MAC_SIZE] ; + +/* + * Version "v1" of file's metadata. 
+ * Metadata of this version has 4 components: + * + * 1) EMTD (Encrypted part of MeTaData); + * 2) NMTD (Non-encrypted part of MeTaData); + * 3) EMTD_MAC; (EMTD Message Authentication Code); + * 4) Array of per-link NMTD MACs (for every (hard)link it includes + * exactly one MAC) + */ +struct mtd_format_v1 { + /* EMTD, encrypted part of meta-data */ + uint8_t alg_id; /* cipher algorithm id (only AES for now) */ + uint8_t mode_id; /* cipher mode id; (only XTS for now) */ + uint8_t block_bits; /* encoded block size */ + uint8_t minor_id; /* client translator id */ + uint8_t dkey_factor; /* encoded size of the data key */ + /* MACs */ + emtd_8_mac_t gmac; /* MAC of the encrypted meta-data, 8 bytes */ + nmtd_8_mac_t omac; /* per-link MACs of the non-encrypted + * meta-data: at least one such MAC is always + * present */ +} __attribute__((packed)); + +/* + * NMTD, the non-encrypted part of metadata of version "v1" + * is file's gfid, which is generated on trusted machines. + */ +#define SIZE_OF_NMTD_V1 (sizeof(uuid_t)) +#define SIZE_OF_EMTD_V1 (offsetof(struct mtd_format_v1, gmac) - \ + offsetof(struct mtd_format_v1, alg_id)) +#define SIZE_OF_NMTD_V1_MAC (NMTD_8_MAC_SIZE) +#define SIZE_OF_EMTD_V1_MAC (EMTD_8_MAC_SIZE) + +static inline unsigned char *get_EMTD_V1(struct mtd_format_v1 *format) +{ + return &format->alg_id; +} + +static inline unsigned char *get_NMTD_V1(struct crypt_inode_info *info) +{ + return info->oid; +} + +static inline unsigned char *get_EMTD_V1_MAC(struct mtd_format_v1 *format) +{ + return format->gmac; +} + +static inline unsigned char *get_NMTD_V1_MAC(struct mtd_format_v1 *format) +{ + return format->omac; +} + +#endif /* __METADATA_H__ */ diff --git a/xlators/mgmt/glusterd/src/glusterd-volgen.c b/xlators/mgmt/glusterd/src/glusterd-volgen.c index 51fba4da343..6af52abe286 100644 --- a/xlators/mgmt/glusterd/src/glusterd-volgen.c +++ b/xlators/mgmt/glusterd/src/glusterd-volgen.c @@ -2452,6 +2452,29 @@ out: return ret; } +static int 
client_graph_set_perf_options(volgen_graph_t *graph, + glusterd_volinfo_t *volinfo, + dict_t *set_dict) +{ + data_t *tmp_data = NULL; + char *volname = NULL; + + /* + * Logic to make sure NFS doesn't have performance translators by + * default for a volume + */ + volname = volinfo->volname; + tmp_data = dict_get (set_dict, "nfs-volume-file"); + if (!tmp_data) + return volgen_graph_set_options_generic(graph, set_dict, + volname, + &perfxl_option_handler); + else + return volgen_graph_set_options_generic(graph, set_dict, + volname, + &nfsperfxl_option_handler); +} + static int client_graph_builder (volgen_graph_t *graph, glusterd_volinfo_t *volinfo, dict_t *set_dict, void *param) @@ -2459,7 +2482,6 @@ client_graph_builder (volgen_graph_t *graph, glusterd_volinfo_t *volinfo, int ret = 0; xlator_t *xl = NULL; char *volname = NULL; - data_t *tmp_data = NULL; volname = volinfo->volname; ret = volgen_graph_build_clients (graph, volinfo, set_dict, param); @@ -2483,6 +2505,18 @@ client_graph_builder (volgen_graph_t *graph, glusterd_volinfo_t *volinfo, } + ret = glusterd_volinfo_get_boolean (volinfo, "features.encryption"); + if (ret == -1) + goto out; + if (ret) { + xl = volgen_graph_add (graph, "encryption/crypt", volname); + + if (!xl) { + ret = -1; + goto out; + } + } + ret = glusterd_volinfo_get_boolean (volinfo, VKEY_FEATURES_QUOTA); if (ret == -1) goto out; @@ -2508,16 +2542,7 @@ client_graph_builder (volgen_graph_t *graph, glusterd_volinfo_t *volinfo, } } - /* Logic to make sure NFS doesn't have performance translators by - default for a volume */ - tmp_data = dict_get (set_dict, "nfs-volume-file"); - if (!tmp_data) - ret = volgen_graph_set_options_generic (graph, set_dict, volinfo, - &perfxl_option_handler); - else - ret = volgen_graph_set_options_generic (graph, set_dict, volname, - &nfsperfxl_option_handler); - + ret = client_graph_set_perf_options(graph, volinfo, set_dict); if (ret) goto out; diff --git a/xlators/mgmt/glusterd/src/glusterd-volume-set.c 
b/xlators/mgmt/glusterd/src/glusterd-volume-set.c index a035098d8d0..665a8b29859 100644 --- a/xlators/mgmt/glusterd/src/glusterd-volume-set.c +++ b/xlators/mgmt/glusterd/src/glusterd-volume-set.c @@ -724,6 +724,34 @@ struct volopt_map_entry glusterd_volopt_map[] = { .flags = OPT_FLAG_CLIENT_OPT }, + /* Crypt xlator options */ + + { .key = "features.encryption", + .voltype = "encryption/crypt", + .option = "!feat", + .value = "off", + .op_version = 3, + .description = "enable/disable client-side encryption for " + "the volume.", + .flags = OPT_FLAG_CLIENT_OPT | OPT_FLAG_XLATOR_OPT + }, + + { .key = "encryption.master-key", + .voltype = "encryption/crypt", + .op_version = 3, + .flags = OPT_FLAG_CLIENT_OPT + }, + { .key = "encryption.data-key-size", + .voltype = "encryption/crypt", + .op_version = 3, + .flags = OPT_FLAG_CLIENT_OPT + }, + { .key = "encryption.block-size", + .voltype = "encryption/crypt", + .op_version = 3, + .flags = OPT_FLAG_CLIENT_OPT + }, + /* Client xlator options */ { .key = "network.frame-timeout", .voltype = "protocol/client", |
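For readers following the metadata.c part of this patch, the per-link MAC lookup done by lookup_link_mac_v1() and verify_link_mac_v1() can be illustrated with a small standalone sketch. It is not the translator's code: toy_link_mac() and toy_lookup_link_mac() are hypothetical names, the "MAC" is a plain FNV-1a hash of the pathname used only so the example has no OpenSSL dependency, and the MAC is computed once before the loop rather than once per index as in the patch. Only the 8-byte MAC size and the return convention (index on success, -ENOENT when nothing matches, a negative value on errors) are taken from the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Mirrors SIZE_OF_NMTD_V1_MAC (8 bytes) from metadata.h in the patch. */
#define TOY_MAC_SIZE 8

/*
 * Hypothetical stand-in for calc_link_mac_v1(). This is an FNV-1a hash
 * of the pathname, NOT a cryptographic MAC; it only makes the sketch
 * self-contained. Returns 0 on success, -1 on a bad argument.
 */
static int toy_link_mac(const char *path, uint8_t out[TOY_MAC_SIZE])
{
        uint64_t h = 1469598103934665603ULL;    /* FNV-1a offset basis */
        size_t i;

        if (!path)
                return -1;
        for (; *path; path++) {
                h ^= (uint8_t)*path;
                h *= 1099511628211ULL;          /* FNV-1a prime */
        }
        for (i = 0; i < TOY_MAC_SIZE; i++)
                out[i] = (uint8_t)(h >> (8 * i));
        return 0;
}

/*
 * Scan an array of nr_macs fixed-size MACs for the one matching @path.
 *
 * return index of the MAC, if it was found;
 * return -ENOENT if no MAC matches;
 * return -1 if the MAC could not be computed.
 */
static int32_t toy_lookup_link_mac(const uint8_t *macs, uint32_t nr_macs,
                                   const char *path)
{
        uint8_t cmac[TOY_MAC_SIZE];
        uint32_t idx;

        assert(macs != NULL);
        if (toy_link_mac(path, cmac))
                return -1;
        for (idx = 0; idx < nr_macs; idx++) {
                if (!memcmp(cmac, macs + idx * TOY_MAC_SIZE, TOY_MAC_SIZE))
                        return (int32_t)idx;
        }
        return -ENOENT;
}
```

A caller would store the returned index (as open_format_v1() does with local->mac_idx) and treat any negative value as failed authentication of the link. A real implementation would use a keyed MAC and might prefer a constant-time comparison over memcmp(); both points are outside what this sketch shows.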