glusterfs.git/xlators/cluster, branch v6.4

[RFC] change get_real_filename implementation to use ENOATTR instead of ENOENT

2019-07-04T05:53:23+00:00

get_real_filename is implemented as a virtual extended attribute to help
Samba implement the case-insensitive but case-preserving SMB protocol
more efficiently. It is implemented as a getxattr call on the parent directory
with a virtual key consisting of "get_real_filename:" followed by the
file/dir name. The call looks for a spelling of that name with different
case and returns the correct spelling as the result if the entry is found.
Originally (05aaec645a6262d431486eb5ac7cd702646cfcfb), the
implementation used the ENOENT errno to return the authoritative answer
that the name does not exist in any case folding.
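
As an illustration of the key format described above, here is a minimal
sketch of how a caller might build the virtual key. The helper name
grf_build_key and the macro GRF_KEY_PREFIX are hypothetical, and the prefix
is taken literally from this message; the key used on the wire may carry an
additional namespace prefix.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Key prefix as written in the commit message. Assumption: the real key
 * may be namespaced differently. */
#define GRF_KEY_PREFIX "get_real_filename:"

/* Build the virtual xattr key for `name` into `buf`.
 * Returns 0 on success or -ENAMETOOLONG if `buf` is too small. */
int grf_build_key(const char *name, char *buf, size_t buflen)
{
    int n;

    assert(name != NULL && buf != NULL);
    n = snprintf(buf, buflen, "%s%s", GRF_KEY_PREFIX, name);
    if (n < 0 || (size_t)n >= buflen)
        return -ENAMETOOLONG;
    return 0;
}
```

The resulting key would then be passed as the attribute name to getxattr(2)
on the parent directory, with a buffer to receive the real spelling.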

Now this implementation is actually a violation or misuse of the defined
API for the getxattr call which returns ENOENT for the case that the dir
that the call is made against does not exist and ENOATTR (or the synonym
ENODATA) for the case that the xattr does not exist.

This was not a problem until the gluster fuse-bridge was changed
to map ENOENT to ESTALE in 59629f1da9dca670d5dcc6425f7f89b3e96b46bf,
after which the getxattr call for get_real_filename returned
ESTALE instead of ENOENT, breaking the expectation in Samba.

It is an independent problem that ESTALE should not leak out to user
space but is intended to trigger retries between fuse and gluster.
But nevertheless, the semantics seem to be incorrect here and should
be changed.

This patch changes the implementation of the get_real_filename virtual
xattr to correctly return ENOATTR instead of ENOENT if the file/directory
being looked up is not found.
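
The changed return value can be sketched roughly as follows. This is not the
code from the patch: grf_lookup is a hypothetical helper, and the array of
names stands in for the directory listing that the real implementation
scans.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <strings.h> /* strcasecmp */

#ifndef ENOATTR
#define ENOATTR ENODATA /* the two are synonyms, as noted above */
#endif

/* Look for an entry whose name matches `wanted` ignoring case.
 * On success, store the on-disk spelling in *real and return 0.
 * Return -ENOATTR when no spelling exists; ENOENT stays reserved for a
 * missing parent directory, as the getxattr API defines it. */
int grf_lookup(const char *const *entries, size_t count,
               const char *wanted, const char **real)
{
    assert(entries != NULL && wanted != NULL && real != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcasecmp(entries[i], wanted) == 0) {
            *real = entries[i];
            return 0;
        }
    }
    return -ENOATTR;
}
```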

The Samba glusterfs_fuse vfs module, which takes advantage of
get_real_filename over a fuse mount, will receive a corresponding change
to map ENOATTR to ENOENT. Without that change, it will still work
correctly, but the performance optimization for nonexistent files is
lost. On the other hand, this change removes the distinction
between the old not-implemented case and the implemented case.
So a Samba that has been changed to treat ENOATTR like ENOENT will no
longer work correctly against old servers that don't implement
get_real_filename, i.e. existing files will be reported as nonexistent.
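
The client-side mapping mentioned above might look roughly like this. The
helper grf_map_errno is hypothetical and is not the Samba code; it assumes
the server carries this patch, and the comments note where an old server
would break that assumption.

```c
#include <assert.h>
#include <errno.h>

#ifndef ENOATTR
#define ENOATTR ENODATA
#endif

/* Translate the errno of a failed get_real_filename getxattr into the
 * errno the caller acts on.
 * Assumption: the server is patched, so ENOATTR means "no entry with this
 * name in any case". Against an old server ENOATTR would instead mean
 * "xattr not implemented", and this mapping would report existing files
 * as missing. */
int grf_map_errno(int err)
{
    assert(err > 0);
    if (err == ENOATTR)
        return ENOENT; /* authoritative answer: the name does not exist */
    return err;        /* anything else is passed through unchanged */
}
```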

Backport of:
> Change-Id: I971b427ab8410636d5d201157d9af70e0d075b67
> fixes: bz#1722977
> Signed-off-by: Michael Adam 

Change-Id: I971b427ab8410636d5d201157d9af70e0d075b67
fixes: bz#1723659
Signed-off-by: Michael Adam 
(cherry picked from commit dc1b87fcfef08c9497b0c02b2410c9d18bbc2dba)

cluster/dht: Fixed a memleak in dht_rename_cbk

2019-07-03T06:36:46+00:00

Fixed a memleak in dht_rename_cbk when creating
a linkto file.

>Change-Id: I705adef3cb79e33806520fc2b15558e90e2c211c
>fixes: bz#1722698

Change-Id: I705adef3cb79e33806520fc2b15558e90e2c211c
fixes: bz#1726294
Signed-off-by: N Balachandran 
(cherry picked from commit 532b0fc8b1ace9ad48fdaf643e0b1a34020b6cd8)

cluster/ec: honor contention notifications for partially acquired locks

2019-06-03T04:08:06+00:00

EC was ignoring lock contention notifications received while a lock was
being acquired. When a lock is partially acquired (some bricks have
granted the lock but some others not yet) we can receive notifications
from acquired bricks, which should be honored, since we may not receive
more notifications after that.

Since EC was ignoring them, once the lock was acquired, it was not
released until the eager-lock timeout, causing unnecessary delays on
other clients.

This fix takes into consideration the notifications received before
having completed the full lock acquisition. After that, the lock will
be released as soon as possible.

Backport of:
> BUG: bz#1708156
> Change-Id: I2a306dbdb29fb557dcab7788a258bd75d826cc12
> Signed-off-by: Xavi Hernandez 

Fixes: bz#1714172
Change-Id: I2a306dbdb29fb557dcab7788a258bd75d826cc12
Signed-off-by: Xavi Hernandez

cluster/ec: Reopen shouldn't happen with O_TRUNC

2019-05-15T10:36:50+00:00

Problem:
Doing re-open with O_TRUNC will truncate the fragment even when it is not
needed, requiring extra heals.

Fix:
At the time of re-open don't use O_TRUNC.
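
In terms of open flags, the fix amounts to something like the following
sketch. The helper name reopen_flags is hypothetical; only the masking of
O_TRUNC comes from this message.

```c
#include <assert.h>
#include <fcntl.h>

/* Flags to use when re-opening an fd on a brick. O_TRUNC is dropped so
 * that the fragment already stored on the brick is left untouched; all
 * other flags are kept as they were. */
int reopen_flags(int original_flags)
{
    assert(original_flags >= 0);
    return original_flags & ~O_TRUNC;
}
```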

fixes bz#1709660
Change-Id: Idc6408968efaad897b95a5a52481c66e843d3fb8
Signed-off-by: Pranith Kumar K

afr: thin-arbiter lock release fixes

2019-05-15T04:16:52+00:00

- pass fop state instead of afr local to
afr_ta_dom_lock_check_and_release()

- avoid afr_lock_release_synctask() being called simultaneously from
notify code path and transaction (post-op) code path due to races.

- Check if the post-op on TA is valid based on event_gen checks.

- Invalidate in-memory information when we get TA child down.

Note: This patch addresses some pending review comments of commit
053b1309dc8fbc05fcde5223e734da9f694cf5cc
(https://review.gluster.org/#/c/glusterfs/+/20095/)

fixes: bz#1709130
Change-Id: I2ccd7e1b53362f9f3fed8680aecb23b5011eb18c
Signed-off-by: Ravishankar N 
(cherry picked from commit 9ab2747da78061882f6734df4b265bce11adaef1)

cluster/afr : TA: Return actual error code in case of failure

2019-05-13T05:36:51+00:00

In afr_ta_post_op_do, we were sending EIO for every failure.
However, the original error code should be sent.

Change-Id: I9fdc15dac00d758baf8e6f14db244f526481a63a
updates: bz#1709143
Signed-off-by: Ashish Pandey 
(cherry picked from commit 63159cdb5374f458d7d2bffec24d4720ffc96d6c)

cluster/dht: Refactor dht lookup functions

2019-05-09T12:10:25+00:00

Part 2: Modify dht_revalidate_cbk to call
dht_selfheal_directory instead of separate calls
to heal attrs and xattrs.

Change-Id: Id41ac6c4220c2c35484812bbfc6157fc3c86b142
fixes: bz#1707393
Signed-off-by: N Balachandran

cluster/dht: refactor dht lookup functions

2019-05-08T14:00:05+00:00

Part 1: Refactor the dht_lookup_dir_cbk
and dht_selfheal_directory functions.
Added a simple dht selfheal directory test.

Change-Id: I1410c26359e3c14b396adbe751937a52bd2fcff9
updates: bz#1707393
Signed-off-by: N Balachandran

cluster/ec: fix fd reopen

2019-05-08T13:54:59+00:00

Currently EC tries to reopen fd's that have been opened while a brick
was down. This is done as part of regular write operations, just after
having acquired the locks, and it's sent as a sub-fop of the main write
fop.

There were two problems:

1. The reopen was attempted on all UP bricks, even if a previous lock
didn't succeed. This is incorrect because most probably the open will
fail.

2. If reopen is sent and fails, the error is propagated to the main
operation, causing it to fail when it shouldn't.

To fix this, we only attempt reopens on bricks where the current fop
owns a lock, and we prevent any error from being propagated to the main
fop.

To implement this behaviour, an argument used to indicate the minimum
number of required answers has been overloaded to also include some flags. To
make the change consistent, it has been necessary to rename the
argument, which means that a lot of files have been changed. However,
those renames carry no functional changes.
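
The two ideas above (choosing reopen targets by lock ownership, and packing
flags next to the minimum answer count) can be sketched as follows. The bit
layout and every name prefixed with sk_ or SK_ are hypothetical and do not
reflect the actual EC definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: the low 8 bits carry the minimum number of
 * answers, the remaining bits carry behaviour flags. */
#define SK_MIN_MASK          0x00ffu
#define SK_FLAG_NO_PROPAGATE 0x0100u /* sub-fop errors stay local */

uint32_t sk_pack(unsigned minimum, uint32_t flags)
{
    assert(minimum <= SK_MIN_MASK);
    assert((flags & SK_MIN_MASK) == 0);
    return flags | minimum;
}

unsigned sk_minimum(uint32_t fop_flags)
{
    return fop_flags & SK_MIN_MASK;
}

bool sk_propagates_error(uint32_t fop_flags)
{
    return (fop_flags & SK_FLAG_NO_PROPAGATE) == 0;
}

/* Bricks on which a reopen is attempted, one bit per brick: the brick is
 * up, the fd is not yet open there, and the current fop owns a lock. */
uint64_t sk_reopen_targets(uint64_t up, uint64_t fd_open, uint64_t locked)
{
    return up & locked & ~fd_open;
}
```

With bricks 0 to 2 up, the fd open only on brick 0, and a lock owned only on
brick 1, sk_reopen_targets(0x7, 0x1, 0x2) selects brick 1 alone.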

This change has also uncovered a problem in discard code, which didn't
correctly process requests of small sizes because no real discard fop
was being processed, only a write of 0's on some region. In this case
some fields of the fop remained uninitialized or with incorrect values.
To fix this, a new function has been created to simulate success on a
fop and it's used in the discard case.

Thanks to Pranith for providing a test script that has also detected an
issue in this patch. This patch includes a small modification of this
script to force data to be written into bricks before stopping them.

Backport of:
> Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec
> BUG: bz#1699866
> Signed-off-by: Xavi Hernandez 

Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec
Fixes: bz#1699917
Signed-off-by: Xavi Hernandez

cluster/afr: Remove local from owners_list on failure of lock-acquisition

2019-04-16T11:29:03+00:00

When eager-lock lock acquisition fails, for example because of network
failures, the local is not removed from owners_list. Waiting frames then
accumulate and the application hangs, because the waiting frames assume
that another transaction is still acquiring the lock as long as
owners_list is not empty. This patch handles that case as well and adds
asserts to make such problems easier to find in the future.
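
The hang and its fix can be modelled with counters instead of real lists.
This is only a sketch: struct sk_lock and its functions are hypothetical,
and what happens to the resumed waiters afterwards is outside its scope.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sk_lock {
    size_t owners;  /* transactions holding or acquiring the lock */
    size_t waiting; /* transactions parked until the owners finish */
};

/* A new transaction arrives: it becomes an owner if nobody else is,
 * otherwise it waits. Returns true if it became an owner. */
bool sk_lock_enter(struct sk_lock *l)
{
    assert(l != NULL);
    if (l->owners == 0) {
        l->owners++;
        return true;
    }
    l->waiting++;
    return false;
}

/* Lock acquisition failed for an owner. Dropping it from the owners is
 * the step the commit describes; without it `owners` never reaches zero
 * and the waiters stay parked forever.
 * Returns the number of waiters that must now be resumed. */
size_t sk_lock_acquire_failed(struct sk_lock *l)
{
    size_t to_resume = 0;

    assert(l != NULL && l->owners > 0);
    l->owners--;
    if (l->owners == 0) {
        to_resume = l->waiting;
        l->waiting = 0;
    }
    return to_resume;
}
```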

fixes bz#1699731
Change-Id: I3101393265e9827755725b1f2d94a93d8709e923
Signed-off-by: Pranith Kumar K