glusterfs.git/xlators/cluster/ec/src/ec-common.c, branch release-5

cluster/ec: Fix reopen flags to avoid misbehavior

2020-02-25T07:04:17+00:00

Problem:
when a file needs to be re-opened O_APPEND and O_EXCL
flags are not filtered in EC.

- O_APPEND should be filtered because EC doesn't send O_APPEND below EC for
open to make sure writes happen on the individual fragments instead of at the
end of the file.

- O_EXCL should be filtered because shd could have created the file so even
when file exists open should succeed

- O_CREAT should be filtered because open happens with gfid as parameter. So
open fop will create just the gfid which will lead to problems.

Fix:
Filter out these two flags in reopen.

Change-Id: Ia280470fcb5188a09caa07bf665a2a94bce23bc4
Fixes: bz#1805055
Signed-off-by: Pranith Kumar K

cluster/ec: Always read from good-mask

2020-02-25T06:48:11+00:00

There are cases where fop->mask may have fop->healing added
and readv shouldn't be wound on fop->healing. To avoid this
always wind readv to lock->good_mask

fixes: bz#1805054
Change-Id: I2226ef0229daf5ff315d51e868b980ee48060b87
Signed-off-by: Pranith Kumar K

cluster/ec: inherit healing from lock when it has info

2020-02-25T06:47:20+00:00

If lock has info, fop should inherit healing mask from it.
Otherwise, fop cannot inherit right healing when changed_flags is zero.

Change-Id: Ife80c9169d2c555024347a20300b0583f7e8a87f
fixes: bz#1805051
Signed-off-by: Kinglong Mee

cluster/ec: Prevent double pre-op xattrops

2020-02-24T07:55:55+00:00

Problem:
Race:
Thread-1                                    Thread-2
1) Does ec_get_size_version() to perform
pre-op fxattrop as part of write-1
                                           2) Calls ec_set_dirty_flag() in
                                              ec_get_size_version() for write-2.
					      This sets dirty[] to 1
3) Completes executing
ec_prepare_update_cbk leading to
ctx->dirty[] = '1'
					   4) Takes LOCK(inode->lock) to check if there are
					      any flags and sets dirty-flag because
				              lock->waiting_flag is 0 now. This leads to
					      fxattrop to increment on-disk dirty[] to '2'

At the end of the writes the file will be marked for heal even when it doesn't need heal.

Fix:
Perform ec_set_dirty_flag() and other checks inside LOCK() to prevent dirty[] to be marked
as '1' in step 2) above

Fixes: bz#1805050
Change-Id: Icac2ab39c0b1e7e154387800fbededc561612865
Signed-off-by: Pranith Kumar K

cluster/ec: Reopen shouldn't happen with O_TRUNC

2020-02-24T07:54:22+00:00

Problem:
Doing re-open with O_TRUNC will truncate the fragment even when it is not
needed needing extra heals

Fix:
At the time of re-open don't use O_TRUNC.

fixes: bz#1805049
Change-Id: Idc6408968efaad897b95a5a52481c66e843d3fb8
Signed-off-by: Pranith Kumar K

cluster/ec: fix fd reopen

2020-02-20T08:26:40+00:00

Currently EC tries to reopen fd's that have been opened while a brick
was down. This is done as part of regular write operations, just after
having acquired the locks, and it's sent as a sub-fop of the main write
fop.

There were two problems:

1. The reopen was attempted on all UP bricks, even if a previous lock
didn't succeed. This is incorrect because most probably the open will
fail.

2. If reopen is sent and fails, the error is propagated to the main
operation, causing it to fail when it shouldn't.

To fix this, we only attempt reopens on bricks where the current fop
owns a lock, and we prevent any error to be propagated to the main
fop.

To implement this behaviour an argument used to indicate the minimum
number of required answers has overloaded to also include some flags. To
make the change consistent, it has been necessary to rename the
argument, which means that a lot of files have been changed. However
there are no functional changes.

This change has also uncovered a problem in discard code, which didn't
correctely process requests of small sizes because no real discard fop
was being processed, only a write of 0's on some region. In this case
some fields of the fop remained uninitialized or with incorrect values.
To fix this, a new function has been created to simulate success on a
fop and it's used in the discard case.

Thanks to Pranith for providing a test script that has also detected an
issue in this patch. This patch includes a small modification of this
script to force data to be written into bricks before stopping them.

Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec
Fixes: bz#1805047
Signed-off-by: Xavi Hernandez

cluster/ec: honor contention notifications for partially acquired locks

2019-06-28T11:07:08+00:00

EC was ignoring lock contention notifications received while a lock was
being acquired. When a lock is partially acquired (some bricks have
granted the lock but some others not yet) we can receive notifications
from acquired bricks, which should be honored, since we may not receive
more notifications after that.

Since EC was ignoring them, once the lock was acquired, it was not
released until the eager-lock timeout, causing unnecessary delays on
other clients.

This fix takes into consideration the notifications received before
having completed the full lock acquisition. After that, the lock will
be releaed as soon as possible.

Backport of:
> BUG: bz#1708156
> Change-Id: I2a306dbdb29fb557dcab7788a258bd75d826cc12
> Signed-off-by: Xavi Hernandez 

Fixes: bz#1717282
Change-Id: I2a306dbdb29fb557dcab7788a258bd75d826cc12
Signed-off-by: Xavi Hernandez 
Signed-off-by: Hari Gowtham

Land part 2 of clang-format changes

2018-09-12T12:22:45+00:00

Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4
Signed-off-by: Nigel Babu

cluster/ec: Don't update trusted.ec.version if fop succeeds

2018-09-07T07:29:39+00:00

If a fop has succeeded on all the bricks and trying to release
the lock, there is no need to update the version for the
file/entry. All it will do is to increase the version from
x to x+1 on all the bricks.

If this update (x to x+1) fails on some brick, this will indicate
that the entry is unhealthy while in realty everything is fine
with the entry.

Avoiding this update will help to not to send one xattrop
at the end of the fops. Which will decrease the chances
of entries being in unhealthy state and also improve the
performance.

Change-Id: Id9fca6bd2991425db6ed7d1f36af27027accb636
fixes: bz#1623759
Signed-off-by: Ashish Pandey

cluster/ec: Improve logging for some critical error messages

2018-09-07T07:23:21+00:00

Change-Id: I037e52a3467467b81a1ba5416317870864060d4d
updates: bz#1615703
Signed-off-by: Ashish Pandey