<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/tests/basic/afr, branch v8.1</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>cluster/afr: Delay post-op for fsync</title>
<updated>2020-07-28T13:21:05+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2020-05-29T08:54:53+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=e8aedcd40f9f24e5b821e1539275e40ebfccca94'/>
<id>e8aedcd40f9f24e5b821e1539275e40ebfccca94</id>
<content type='text'>
Problem:
AFR doesn't delay the post-op for the fsync fop. For fsync-heavy workloads
this leads to an unnecessary fxattrop/finodelk for every fsync, leading
to bad performance.

Fix:
Enable delayed post-op for fsync. Add a special flag in xdata to indicate
that AFR shouldn't delay the post-op in cases where either the
process will terminate or a graph switch will happen. Otherwise it leads
to unnecessary heals when the graph switch or process termination
happens before the delayed post-op completes.
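
Below is a minimal C sketch of the idea, not the actual AFR
transaction code: the names (xdata_no_delay, post_op_may_be_delayed)
are hypothetical and only illustrate when the post-op is flushed
immediately versus deferred.

    /* Hypothetical sketch; the real logic lives in AFR's
       transaction framework. */
    typedef struct {
        int fop;            /* e.g. GF_FOP_FSYNC */
        int xdata_no_delay; /* flag set in xdata by callers that are
                               about to terminate or switch graphs */
    } txn_sketch_t;

    /* Returns 1 if the post-op (fxattrop to clear dirty marks, then
       finodelk unlock) may be deferred and batched with later ops. */
    static int
    post_op_may_be_delayed(txn_sketch_t txn)
    {
        if (txn.xdata_no_delay)
            return 0; /* flush now to avoid spurious heals */
        return 1;     /* with this fix, fsync also takes this path */
    }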

Fixes: #1253
Change-Id: I531940d13269a111c49e0510d49514dc169f4577
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
</entry>
<entry>
<title>afr: event gen changes</title>
<updated>2020-04-24T12:50:13+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2020-04-10T09:00:57+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=1c8af2522458e3763302ec408fe1832dff150efa'/>
<id>1c8af2522458e3763302ec408fe1832dff150efa</id>
<content type='text'>
The general idea of the changes is to prevent resetting event generation
to zero in the inode ctx, since event gen is something that should
follow 'causal order'.

Change #1:
For a read txn, in the inode refresh cbk, if event_generation is
found to be zero, we fail the read fop. This is not needed, because a
change in event gen is only a marker for the next inode refresh to
happen and should not be taken into account by the current read txn.

Change #2:
The event gen being zero above can happen if there is a racing lookup,
which resets event gen (in afr_lookup_done) if there are non-zero afr
xattrs. The resetting is done only to trigger an inode refresh and a
possible client-side heal on the next lookup. That can be achieved by
setting the need_refresh flag in the inode ctx. So all occurrences of
resetting event gen to zero are replaced with a call to
afr_inode_need_refresh_set().

Change #3:
In both the lookup and discover paths, we are doing an inode refresh,
which is not required since all three essentially do the same thing:
update the inode ctx with the good/bad copies from the brick replies.
Inode refresh also triggers background heals, but I think it is okay to
do that when we call refresh during the read and write txns and not in
the lookup path.

The .t tests which relied on the inode refresh in the lookup path to
trigger heals are now changed to do a read txn so that the inode
refresh and the heal happen.
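
Below is a minimal C sketch of Change #2, with hypothetical field
names; the real helper is afr_inode_need_refresh_set() on the AFR
inode ctx. The point is to set a refresh flag instead of destroying
the event generation.

    typedef struct {
        int event_gen;    /* must stay causally ordered; never reset */
        int need_refresh; /* forces an inode refresh on next access */
    } inode_ctx_sketch_t;

    static void
    need_refresh_set_sketch(inode_ctx_sketch_t *ctx)
    {
        /* old behaviour: (*ctx).event_gen = 0;  -- lost ordering and
           raced with read txns that treated 0 as an error */
        (*ctx).need_refresh = 1;
    }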

Change-Id: Iebf39a9be6ffd7ffd6e4046c96b0fa78ade6c5ec
Fixes: #1179
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reported-by: Erik Jacobson &lt;erik.jacobson at hpe.com&gt;
(cherry picked from commit f0fcd909ad4535b60c9208d4804ebe6afe421a09)
</content>
</entry>
<entry>
<title>cluster/afr: Fixes for halo</title>
<updated>2020-03-13T13:20:37+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2020-02-04T13:12:33+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=b164a74884becef281b57ef93428bb740e3e342e'/>
<id>b164a74884becef281b57ef93428bb740e3e342e</id>
<content type='text'>
The current implementation assumes that the ping event will come after
the connect event, but that may not hold in cases where fds need to be
re-opened after the socket connection, which consumes more time. So
handle the ping/child-up events arriving in any order.
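
The following C sketch, with hypothetical names, shows the
order-independent handling: each event just records itself, and the
child is marked usable only once both have been seen.

    typedef struct {
        int saw_ping;
        int saw_child_up;
    } halo_sketch_t;

    static void
    mark_child_usable(halo_sketch_t *s)
    {
        (void)s; /* placeholder for updating halo latency/up state */
    }

    static void
    on_event(halo_sketch_t *s, int is_ping_event)
    {
        if (is_ping_event)
            (*s).saw_ping = 1;
        else
            (*s).saw_child_up = 1;
        if ((*s).saw_ping)
            if ((*s).saw_child_up)
                mark_child_usable(s); /* fires in either arrival order */
    }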

fixes: bz#1800583
Change-Id: I6bcdc0caa503bdc039ef2b4739fbf4afae121f05
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
</entry>
<entry>
<title>cluster/afr: Add afr_seek to fops table</title>
<updated>2019-10-14T11:26:56+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2019-10-10T04:07:24+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=2f43a1a4b567ec02ec03847b6eaec5ac59e68808'/>
<id>2f43a1a4b567ec02ec03847b6eaec5ac59e68808</id>
<content type='text'>
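afr_seek existed but was not wired into AFR's fop table, so SEEK fops
fell through to the default implementation. A sketch of the one-line
change (the neighbouring entries are illustrative, not the full table):

    struct xlator_fops fops = {
        .lookup = afr_lookup,
        .readv  = afr_readv,
        .seek   = afr_seek, /* added by this patch */
    };
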
fixes: bz#1760189
Change-Id: Iffbf8d6f4c50b8e2de8364658697bdbe96549f5d
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
</entry>
<entry>
<title>ctime/rebalance: Heal ctime xattr on directory during rebalance</title>
<updated>2019-09-16T06:42:29+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2019-07-29T13:00:42+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=304640e55c0f3c6d15f4e230dc6376e4f5020fea'/>
<id>304640e55c0f3c6d15f4e230dc6376e4f5020fea</id>
<content type='text'>
After add-brick and rebalance, the ctime xattr is not present
on rebalanced directories on the new brick. This patch fixes
that.

Note that ctime still doesn't support consistent time across
distribute sub-volumes.

This patch also fixes the in-memory inconsistency of time attributes
when metadata is self-healed.

Change-Id: Ia20506f1839021bf61d4753191e7dc34b31bb2df
fixes: bz#1734026
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
</entry>
<entry>
<title>posix/ctime: Fix ctime upgrade issue</title>
<updated>2019-06-21T11:09:32+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2019-06-13T10:53:21+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=2959fbfbb77e2b1b2ccbc90610f8ff9362109ae3'/>
<id>2959fbfbb77e2b1b2ccbc90610f8ff9362109ae3</id>
<content type='text'>
Problem:
On an EC volume, during upgrade from an older version where the
ctime feature is not enabled (or not present) to a newer version
where the ctime feature is available (and enabled by default),
self-heal hangs and doesn't complete.

Cause:
The ctime feature has both client-side code (utime) and
server-side code (posix). The feature is driven from the client:
only if the client side sets the time in the frame should the
server side set the time attributes in the xattr. But posix
setattr/fsetattr was not honouring that. When one of the server
nodes is updated, since ctime is enabled by default, it
starts setting the xattr on setattr/fsetattr on the updated node/brick.

On an EC volume the first two updated nodes (bricks) are not a
problem because there are 4 other bricks with consistent data.
However, once the third brick is updated, the new attribute (mdata
xattr) causes a metadata inconsistency across 3 bricks, which
prevents the file from being repaired.

Fix:
Don't create the mdata xattr from the utimes/utimensat system call.
Only update it if it is already present.
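
A minimal decision sketch of the fix (hypothetical names; the real
code is in posix's metadata handling): the brick writes the mdata
xattr only when the client actually sent a time in the frame, and
otherwise merely refreshes an xattr that already exists.

    /* Returns 1 when the mdata xattr should be written. */
    static int
    should_write_mdata_sketch(int client_sent_time, int mdata_exists)
    {
        if (client_sent_time)
            return 1; /* client drives the feature: create or update */
        if (mdata_exists)
            return 1; /* keep an existing xattr up to date */
        return 0;     /* never create it behind the client's back */
    }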

Change-Id: Ieacedecb8a738bb437283ef3e0f042fd49dc4c8c
fixes: bz#1720201
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
</entry>
<entry>
<title>tests: Fix split-brain-favorite-child-policy.t failure</title>
<updated>2019-06-10T18:13:26+00:00</updated>
<author>
<name>karthik-us</name>
<email>ksubrahm@redhat.com</email>
</author>
<published>2019-06-10T17:49:44+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=8e5656d950ff7bca21594b47921c6c0bf98eecea'/>
<id>8e5656d950ff7bca21594b47921c6c0bf98eecea</id>
<content type='text'>
Problem:
The test case is failing to heal the volume within $HEAL_TIMEOUT @195.
This is happening because, as part of split-brain resolution, the file
gets expunged from the sink and the new-entry mark for that file is
made on the source bricks as part of impunging. Since the source
bricks' shd threads failed to get the heal-domain lock, they wait
for the heal-timeout of 10 minutes, which is greater than $HEAL_TIMEOUT.

Fix:
Set cluster.heal-timeout to 5 seconds to trigger the heal so that
one of the source bricks heals the file within $HEAL_TIMEOUT.

Change-Id: Ie73c578cc5361c0d617a48ccc86026734d20ba8c
fixes: bz#1718998
Signed-off-by: karthik-us &lt;ksubrahm@redhat.com&gt;
</content>
</entry>
<entry>
<title>tests: Fix spurious failures in ta-write-on-bad-brick.t</title>
<updated>2019-05-24T15:00:28+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2019-05-21T05:28:44+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=f6c0b59725615da10435c40fec0f26dae542de74'/>
<id>f6c0b59725615da10435c40fec0f26dae542de74</id>
<content type='text'>
Problem:
afr_child_up_status_meta works only when LOOKUP on $M0 is successful.
There are cases where quorum is not met and LOOKUP fails on $M0, which
leads to failures similar to:
grep: /mnt/glusterfs/0/.meta/graphs/active/patchy-replicate-0/private: Transport endpoint is not connected
This was happening once in a while, depending on attribute-timeout and
md-cache not serving the lookup.

Fix:
Find the child-up status from a statedump instead. Also changed the
mount options to include --entry-timeout=0 and --attribute-timeout=0.

updates bz#1193929
Change-Id: Ic0de72c3006d7399a5feb3e4d10d4748949b2ab3
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
</entry>
<entry>
<title>tests: improve and fix some test scripts</title>
<updated>2019-05-09T09:24:10+00:00</updated>
<author>
<name>Xavier Hernandez</name>
<email>jahernan@redhat.com</email>
</author>
<published>2018-01-19T11:18:13+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=aee9e3d27f56e4c0c2f981f20b15189eb7ffce51'/>
<id>aee9e3d27f56e4c0c2f981f20b15189eb7ffce51</id>
<content type='text'>
Change-Id: Iceefe22af754096c599dc570d4894d14fce4deae
Updates: bz#1193929
Signed-off-by: Xavier Hernandez &lt;xhernandez@redhat.com&gt;
</content>
</entry>
<entry>
<title>cluster/afr: Invalidate inode on change of split-brain-choice</title>
<updated>2019-04-05T09:48:50+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2019-04-04T06:53:11+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=443cb74587c251fecbcdfa119a3620ff01131a36'/>
<id>443cb74587c251fecbcdfa119a3620ff01131a36</id>
<content type='text'>
When the split-brain choice is changed from one brick to another,
inode invalidation is not triggered, so the readv call is served
from the cache, leading to failures in split-brain-resolution.t.
Fixed it by calling inode_invalidate() when this happens.
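
A short C sketch of the intent: inode_t and inode_invalidate() are
the real libglusterfs type and API (so this assumes glusterfs's
headers), while the wrapper and choice parameters are illustrative
only. Switching the choice purges cached data so the next read is
served from the newly chosen brick.

    static int
    on_split_brain_choice_change_sketch(inode_t *inode, int old_choice,
                                        int new_choice)
    {
        if (old_choice == new_choice)
            return 0; /* nothing changed; keep caches */
        return inode_invalidate(inode); /* purge cached pages/attrs */
    }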

updates bz#1193929
Change-Id: I2624614eec38c0303f3e1dc55dfae3d4b864218b
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
</entry>
</feed>
