glusterfs.git/tests/basic, branch v6.10

afr: event gen changes

2020-07-08T02:19:40+00:00

The general idea of the changes is to prevent resetting event generation
to zero in the inode ctx, since event gen is something that should
follow 'causal order'.

Change #1:
For a read txn, in inode refresh cbk, if event_generation is
found zero, we are failing the read fop. This is not needed
because change in event gen is only a marker for the next inode refresh to
happen and should not be taken into account by the current read txn.

Change #2:
The event gen being zero above can happen if there is a racing lookup,
which resets even get (in afr_lookup_done) if there are non zero afr
xattrs. The resetting is done only to trigger an inode refresh and a
possible client side heal on the next lookup. That can be acheived by
setting the need_refresh flag in the inode ctx. So replaced all
occurences of resetting even gen to zero with a call to
afr_inode_need_refresh_set().

Change #3:
In both lookup and discover path, we are doing an inode refresh which is
not required since all 3 essentially do the same thing- update the inode
ctx with the good/bad copies from the brick replies. Inode refresh also
triggers background heals, but I think it is okay to do it when we call
refresh during the read and write txns and not in the lookup path.

The .ts which relied on inode refresh in lookup path to trigger heals are
now changed to do read txn so that inode refresh and the heal happens.

Change-Id: Iebf39a9be6ffd7ffd6e4046c96b0fa78ade6c5ec
Fixes: #1179
Signed-off-by: Ravishankar N 
Reported-by: Erik Jacobson 
(cherry picked from commit f0fcd909ad4535b60c9208d4804ebe6afe421a09)

tests: skip tests on absence of reflink in xfs

2020-05-26T09:36:12+00:00

Fixes: #1223
Change-Id: I36cb72d920ffd77405051546615c5262c392daef
Signed-off-by: Pranith Kumar K 
(cherry picked from commit b85f01abab658d1d704cd6caf84dd64eddafbff7)

mount/fuse: Wait for 'mount' child to exit before dying

2020-04-22T05:21:08+00:00

Problem:
tests/bugs/protocol/bug-1433815-auth-allow.t fails
sometimes because of stale mount. This stale mount
comes into picture when parent process dies without
waiting for the child process which mounts fuse fs
to die

Fix:
Wait for mounting child process to die before dying.

Fixes: #1152
Change-Id: I8baee8720e88614fdb762ea822d5877973eef8dc
Signed-off-by: Pranith Kumar K

snap_scheduler: python3 compatibility and new test case

2020-04-20T07:50:18+00:00

Problem:
"snap_scheduler.py init" command failing with the below traceback:

[root@dhcp43-104 ~]# snap_scheduler.py init
Traceback (most recent call last):
  File "/usr/sbin/snap_scheduler.py", line 941, in 
    sys.exit(main(sys.argv[1:]))
  File "/usr/sbin/snap_scheduler.py", line 851, in main
    initLogger()
  File "/usr/sbin/snap_scheduler.py", line 153, in initLogger
    logfile = os.path.join(process.stdout.read()[:-1], SCRIPT_NAME + ".log")
  File "/usr/lib64/python3.6/posixpath.py", line 94, in join
    genericpath._check_arg_types('join', a, *p)
  File "/usr/lib64/python3.6/genericpath.py", line 151, in _check_arg_types
    raise TypeError("Can't mix strings and bytes in path components") from None
TypeError: Can't mix strings and bytes in path components

Solution:

Added the 'universal_newlines' flag to Popen to support backward compatibility.

Added a basic test for snapshot scheduler.

Backport of:

   >Upstream Patch: https://review.gluster.org/#/c/glusterfs/+/24257/
   >Change-Id: I78e8fabd866fd96638747ecd21d292f5ca074a4e
   >Fixes: #1134
   >Signed-off-by: Sunny Kumar 
   >(cherry picked from commit a7d7ec066e56ac03bf252c26beb20fdc2c3b6772)

Change-Id: I78e8fabd866fd96638747ecd21d292f5ca074a4e
Fixes: #1134
Signed-off-by: Sunny Kumar

cluster/ec: Prevent double pre-op xattrops

2019-10-30T08:18:18+00:00

Problem:
Race:
Thread-1                                    Thread-2
1) Does ec_get_size_version() to perform
pre-op fxattrop as part of write-1
                                           2) Calls ec_set_dirty_flag() in
                                              ec_get_size_version() for write-2.
					      This sets dirty[] to 1
3) Completes executing
ec_prepare_update_cbk leading to
ctx->dirty[] = '1'
					   4) Takes LOCK(inode->lock) to check if there are
					      any flags and sets dirty-flag because
				              lock->waiting_flag is 0 now. This leads to
					      fxattrop to increment on-disk dirty[] to '2'

At the end of the writes the file will be marked for heal even when it doesn't need heal.

Fix:
Perform ec_set_dirty_flag() and other checks inside LOCK() to prevent dirty[] to be marked
as '1' in step 2) above

Updates bz#1739446
Change-Id: Icac2ab39c0b1e7e154387800fbededc561612865
Signed-off-by: Pranith Kumar K

cluster/ec: fix EIO error for concurrent writes on sparse files

2019-10-24T09:25:08+00:00

EC doesn't allow concurrent writes on overlapping areas, they are
serialized. However non-overlapping writes are serviced in parallel.
When a write is not aligned, EC first needs to read the entire chunk
from disk, apply the modified fragment and write it again.

The problem appears on sparse files because a write to an offset
implicitly creates data on offsets below it (so, in some way, they
are overlapping). For example, if a file is empty and we read 10 bytes
from offset 10, read() will return 0 bytes. Now, if we write one byte
at offset 1M and retry the same read, the system call will return 10
bytes (all containing 0's).

So if we have two writes, the first one at offset 10 and the second one
at offset 1M, EC will send both in parallel because they do not overlap.
However, the first one will try to read missing data from the first chunk
(i.e. offsets 0 to 9) to recombine the entire chunk and do the final write.
This read will happen in parallel with the write to 1M. What could happen
is that half of the bricks process the write before the read, and the
half do the read before the write. Some bricks will return 10 bytes of
data while the otherw will return 0 bytes (because the file on the brick
has not been expanded yet).

When EC tries to recombine the answers from the bricks, it can't, because
it needs more than half consistent answers to recover the data. So this
read fails with EIO error. This error is propagated to the parent write,
which is aborted and EIO is returned to the application.

The issue happened because EC assumed that a write to a given offset
implies that offsets below it exist.

This fix prevents the read of the chunk from bricks if the current size
of the file is smaller than the read chunk offset. This size is
correctly tracked, so this fixes the issue.

Also modifying ec-stripe.t file for Test #13 within it.
In this patch, if a file size is less than the offset we are writing, we
fill zeros in head and tail and do not consider it strip cache miss.
That actually make sense as we know what data that part holds and there is
no need of reading it from bricks.


Backport of:
 > Patch:https://review.gluster.org/#/c/glusterfs/+/23066/
 > Change-Id: Ic342e8c35c555b8534109e9314c9a0710b6225d6
 > BUG: 1730715
 > Signed-off-by: Xavi Hernandez 
 
(cherry picked from commit b01a43586c5abc23a874e5528a063c508f952cbd)

Change-Id: Ic342e8c35c555b8534109e9314c9a0710b6225d6
Fixes: bz#1739451
Signed-off-by: Xavi Hernandez

ctime/rebalance: Heal ctime xattr on directory during rebalance

2019-09-27T11:34:25+00:00

After add-brick and rebalance, the ctime xattr is not present
on rebalanced directories on new brick. This patch fixes the
same.

Note that ctime still doesn't support consistent time across
distribute sub-volume.

This patch also fixes the in-memory inconsistency of time attributes
when metadata is self healed.

Backport of:
 > Patch: https://review.gluster.org/23127
 > Change-Id: Ia20506f1839021bf61d4753191e7dc34b31bb2df
 > BUG: 1734026
 > Signed-off-by: Kotresh HR 

Patch: https://review.gluster.org/23127
Change-Id: Ia20506f1839021bf61d4753191e7dc34b31bb2df
fixes: bz#1752413
Signed-off-by: Kotresh HR

gfapi: 'glfs_h_creat_open' - new API to create handle and open fd

2019-09-27T11:30:47+00:00

Right now we have two separate APIs, one
- 'glfs_h_creat_handle' to create handle & another
- 'glfs_h_open' to create a glfd to return to application

Having two separate routines can result in access errors
while trying to create and write into a read-only file.

Since a fd is opened even during file/directory creation,
introducing a new API to make these two operations atomic i.e,
which can create both handle & fd and pass them to application

This is backport of below mainline patch -
- https://review.gluster.org/#/c/glusterfs/+/23448/
- bz#1753569

Change-Id: Ibf513fcfcdad175f4d7eb6fa7a61b8feec6d33b5
fixes: bz#1755785
Signed-off-by: Soumya Koduri

ctime: Set mdata xattr on legacy files

2019-08-06T07:23:43+00:00

Problem:
The files which were created before ctime enabled would not
have "trusted.glusterfs.mdata"(stores time attributes) xattr.
Upon fops which modifies either ctime or mtime, the xattr
gets created with latest ctime, mtime and atime, which is
incorrect. It should update only the corresponding time
attribute and rest from backend

Solution:
Creating xattr with values from brick is not possible as
each brick of replica set would have different times.
So create the xattr upon successful lookup if the xattr
is not created

Note To Reviewers:
The time attributes used to set xattr is got from successful
lookup. Instead of sending the whole iatt over the wire via
setxattr, a structure called mdata_iatt is sent. The mdata_iatt
contains only time attributes.

Backport of:
 > Patch:  https://review.gluster.org/22936
 > Change-Id: I5e535631ddef04195361ae0364336410a2895dd4
 > BUG: 1593542
 > Signed-off-by: Kotresh HR 

Change-Id: I5e535631ddef04195361ae0364336410a2895dd4
updates: bz#1733885
Signed-off-by: Kotresh HR

posix/ctime: Fix ctime upgrade issue

2019-07-02T07:40:33+00:00

Problem:
On a EC volume, during upgrade from the older version where
ctime feature is not enabled(or not present) to the newer
version where the ctime feature is available (enabled default),
the self heal hangs and doesn't complete.

Cause:
The ctime feature has both client side code (utime) and
server side code (posix). The feature is driven from client.
Only if the client side sets the time in the frame, should
the server side sets the time attributes in xattr. But posix
setattr/fseattr was not doing that. When one of the server
nodes is updated, since ctime is enabled by default, it
starts setting xattr on setattr/fseattr on the updated node/brick.

On a EC volume the first two updated nodes(bricks) are not a
problem because there are 4 other bricks with consistent data.
However once the third brick is updated, the new attribute(mdata xattr)
will cause an inconsistency on metadata on 3 bricks, which
prevents the file to be repaired.

Fix:
Don't create mdata xattr with utimes/utimensat system call.
Only update if already present.

Backport of:
 > Patch: https://review.gluster.org/22858
 > Change-Id: Ieacedecb8a738bb437283ef3e0f042fd49dc4c8c
 > BUG: 1720201
 > Signed-off-by: Kotresh HR 

Change-Id: Ieacedecb8a738bb437283ef3e0f042fd49dc4c8c
fixes: bz#1722805
Signed-off-by: Kotresh HR