| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
Fixes: #1223
Change-Id: I36cb72d920ffd77405051546615c5262c392daef
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
(cherry picked from commit b85f01abab658d1d704cd6caf84dd64eddafbff7)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The general idea of the changes is to prevent resetting event generation
to zero in the inode ctx, since event gen is something that should
follow 'causal order'.
Change #1:
For a read txn, in inode refresh cbk, if event_generation is
found zero, we are failing the read fop. This is not needed
because change in event gen is only a marker for the next inode refresh to
happen and should not be taken into account by the current read txn.
Change #2:
The event gen being zero above can happen if there is a racing lookup,
which resets even get (in afr_lookup_done) if there are non zero afr
xattrs. The resetting is done only to trigger an inode refresh and a
possible client side heal on the next lookup. That can be acheived by
setting the need_refresh flag in the inode ctx. So replaced all
occurences of resetting even gen to zero with a call to
afr_inode_need_refresh_set().
Change #3:
In both lookup and discover path, we are doing an inode refresh which is
not required since all 3 essentially do the same thing- update the inode
ctx with the good/bad copies from the brick replies. Inode refresh also
triggers background heals, but I think it is okay to do it when we call
refresh during the read and write txns and not in the lookup path.
The .ts which relied on inode refresh in lookup path to trigger heals are
now changed to do read txn so that inode refresh and the heal happens.
Change-Id: Iebf39a9be6ffd7ffd6e4046c96b0fa78ade6c5ec
Fixes: #1179
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reported-by: Erik Jacobson <erik.jacobson at hpe.com>
(cherry picked from commit f0fcd909ad4535b60c9208d4804ebe6afe421a09)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Sometimes test case is failing at the time of creating files
on mount point after mounting the volume
Solution: After started the volume need to wait to make sure all
bricks instances are completely started so put a online_brick_count
check after just started the volume
Change-Id: I5020e7e417539377277ca00189f9c51d2cf877a6
Fixes: #1162
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Do not truncate file offsets and sizes to 32-bit to
prevent tests from spurious failures on >2Gb files.
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Change-Id: I2a77ea5f9f415249b23035eecf07129f19194ac2
Fixes: #1161
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the output of date command is a single digit
number it is preceded by zero which is getting
considered as an octal number. Removing the leading
zero from the number solved the problem.
Fixes: #1156
Change-Id: Iac4fa20607c0bb90d94dd8ff157ef6b60932c560
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
|
|
|
|
|
|
|
|
| |
Parts of the test weren't designed to run in mux mode, this is now fixed
Change-Id: I428c2fcce6d047e324ca5dcaef677ee1794e3dfe
updates: #1154
Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
tests/bugs/glusterd/serialize-shd-manager-glusterd-restart.t
Problem: Sometime volume status is failed after restart glusterd
in one cluster node
Solution: Wait to finish glusterd handshake on down cluster node
Change-Id: Ib23ca41c943caf2903c61ebf42dc437c1b9d6054
Fixes: #1158
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
| |
This test case includes all the basic glusterfind scenarios.
fixes: #1044
Change-Id: I6021443729e35769fe855c5cc41bb3fbc6365ef0
Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: The key "GF_PREOP_PARENT_KEY" has been populated by dht and
for non-distribute volume like 1x3 key is not populated so
posix_is_layout stale throw a message while a file is created
Solution: To avoid a log put a condition before delete a key
Change-Id: I813ee7960633e7f9f5e9ad2f42f288053d9eb71f
Fixes: #1150
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
tests/bugs/protocol/bug-1433815-auth-allow.t fails
sometimes because of stale mount. This stale mount
comes into picture when parent process dies without
waiting for the child process which mounts fuse fs
to die
Fix:
Wait for mounting child process to die before dying.
Fixes: #1152
Change-Id: I8baee8720e88614fdb762ea822d5877973eef8dc
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When bringing back a downed brick and performing lookup from the client
side, the permission on said brick aren't updated on the first lookup,
but only on the second.
This patch modifies permission update logic so the first lookup will
trigger a permission update on the downed brick.
LIMITATIONS OF THE PATCH:
As the choice of source depends on whether the directory has layout or not.
Even the directories on the newly added brick will have layout xattr[zeroed], but the same is not true for a root directory.
Hence, in case in the entire cluster only the newly added bricks are up [and others are down], then any change in permission during this time will be overwritten by the older permissions when the cluster is restarted.
fixes: #999
Change-Id: Ieb70246d41e59f9cae9f70bc203627a433dfbd33
Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Test should wait for process down notification to be received
by glusterd.
Fixes: #1153
Change-Id: I9162b58a92c1a909ca98097f14c0714f9086bdd1
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Before executing a fop in POSIX xlator it builds an internal
path based on GFID.To validate the path it call's (l)stat
system call and while .glusterfs is heavily loaded kernel takes
time to lookup inode and due to that performance drops
Solution: In this patch we followed two ways to improve the performance.
1) Keep open fd specific to first level directory(gfid[0])
in .glusterfs, it would force to kernel keep the inodes
from all those files in cache. In case of memory pressure
kernel won't uncache first level inodes. We need to open
256 fd's per brick to access the entry faster.
2) Use at based call's to access relative path to reduce
path based lookup time.
Note: To verify the patch we have executed kernel untar 100 times on 6
different clients after enabling metadata group-cache and some
other option.We were getting more than 20 percent improvement in
kenel untar after applying the patch.
Credits: Xavi Hernandez <xhernandez@redhat.com>
Change-Id: I1643e6b01ed669b2bb148d02f4e6a8e08da45343
updates: #891
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Following tests are done -
1 - After finishing reset-brick all the bricks should be up.
2 - Heal should be completed.
3 - Check number of entries present on brick which was reset.
Change-Id: I9314bed180293a99d400d94bb8cc7ece999da29e
Updates: #1144
|
|
|
|
|
|
| |
Fixes: #1149
Change-Id: I38483fc7d76d7fe0ac9fb649669a46bdf9c82234
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There was a bug in write-behind that allowed a previous completed write
to overwrite the overlapping region of data from a future write.
Suppose we want to send three writes (W1, W2 and W3). W1 and W2 are
sequential, and W3 writes at the same offset of W2:
W2.offset = W3.offset = W1.offset + W1.size
Both W1 and W2 are sent in parallel. W3 is only sent after W2 completes.
So W3 should *always* overwrite the overlapping part of W2.
Suppose write-behind processes the requests from 2 concurrent threads:
Thread 1 Thread 2
<received W1>
<received W2>
wb_enqueue_tempted(W1)
/* W1 is assigned gen X */
wb_enqueue_tempted(W2)
/* W2 is assigned gen X */
wb_process_queue()
__wb_preprocess_winds()
/* W1 and W2 are sequential and all
* other requisites are met to merge
* both requests. */
__wb_collapse_small_writes(W1, W2)
__wb_fulfill_request(W2)
__wb_pick_unwinds() -> W2
/* In this case, since the request is
* already fulfilled, wb_inode->gen
* is not updated. */
wb_do_unwinds()
STACK_UNWIND(W2)
/* The application has received the
* result of W2, so it can send W3. */
<received W3>
wb_enqueue_tempted(W3)
/* W3 is assigned gen X */
wb_process_queue()
/* Here we have W1 (which contains
* the conflicting W2) and W3 with
* same gen, so they are interpreted
* as concurrent writes that do not
* conflict. */
__wb_pick_winds() -> W3
wb_do_winds()
STACK_WIND(W3)
wb_process_queue()
/* Eventually W1 will be
* ready to be sent */
__wb_pick_winds() -> W1
__wb_pick_unwinds() -> W1
/* Here wb_inode->gen is
* incremented. */
wb_do_unwinds()
STACK_UNWIND(W1)
wb_do_winds()
STACK_WIND(W1)
So, as we can see, W3 is sent before W1, which shouldn't happen.
The problem is that wb_inode->gen is only incremented for requests that
have not been fulfilled but, after a merge, the request is marked as
fulfilled even though it has not been sent to the brick. This allows
that future requests are assigned to the same generation, which could
be internally reordered.
Solution:
Increment wb_inode->gen before any unwind, even if it's for a fulfilled
request.
Special thanks to Stefan Ring for writing a reproducer that has been
crucial to identify the issue.
Change-Id: Id4ab0f294a09aca9a863ecaeef8856474662ab45
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
Fixes: #884
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
frame is accessed after stack-wind. This can lead to crash
if the cbk frees the frame.
Fix:
Use new frame for the wind instead.
Updates: #832
Change-Id: I64754609f1114b0bbd4d1336fa81a56f2cca6e03
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
"snap_scheduler.py init" command failing with the below traceback:
[root@dhcp43-104 ~]# snap_scheduler.py init
Traceback (most recent call last):
File "/usr/sbin/snap_scheduler.py", line 941, in <module>
sys.exit(main(sys.argv[1:]))
File "/usr/sbin/snap_scheduler.py", line 851, in main
initLogger()
File "/usr/sbin/snap_scheduler.py", line 153, in initLogger
logfile = os.path.join(process.stdout.read()[:-1], SCRIPT_NAME + ".log")
File "/usr/lib64/python3.6/posixpath.py", line 94, in join
genericpath._check_arg_types('join', a, *p)
File "/usr/lib64/python3.6/genericpath.py", line 151, in _check_arg_types
raise TypeError("Can't mix strings and bytes in path components") from None
TypeError: Can't mix strings and bytes in path components
Solution:
Added the 'universal_newlines' flag to Popen to support backward compatibility.
Added a basic test for snapshot scheduler.
Change-Id: I78e8fabd866fd96638747ecd21d292f5ca074a4e
Fixes: #1134
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
...if pending xattrs are zero for all children.
Problem:
If there are no pending xattrs and a metadata heal needs to be
performed, it can be possible that we end up with xattrs inadvertendly
deleted from all bricks, as explained in the BZ.
Fix:
After picking one among the sources as the good copy, mark pending xattrs on
all sources to blame the sinks. Now even if this metadata heal fails midway,
a subsequent heal will still choose one of the valid sources that it
picked previously.
Fixes: #1067
Change-Id: If1b050b70b0ad911e162c04db4d89b263e2b8d7b
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: posix_release(dir) functions add the fd's into a ctx->janitor_fds
and janitor thread closes the fd's.In brick_mux environment it is
difficult to handle race condition in janitor threads because brick
spawns a single janitor thread for all bricks.
Solution: Use synctask to execute posix_release(dir) functions instead of
using background a thread to close fds.
Credits: Pranith Karampuri <pkarampu@redhat.com>
Change-Id: Iffb031f0695a7da83d5a2f6bac8863dad225317e
Fixes: bz#1811631
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If worm-file-level enabled and auto-commit-period 0 an initial write
of a file (e.g. $ echo test >> file1.txt) would lead to an zero byte
file because the WORM xlator immediately WORMed the file when it was
created.
To avoid this we move the setting of trusted.worm_file from
worm_create_cbk to worm_release . This means that this xattr will set
when the filehandle is closed and all initial WRITE FOPs succeed.
Finally we also perform gf_worm_state_transition in worm_release to
ensure that the file will be immediately WORMed after the file handle
was closed.
Change-Id: I5d02e18975b646ca1a27ed41d836e9d0dc333204
Fixes: bz#1808421
Signed-off-by: David Spisla <david.spisla@iternity.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When brick-mux is enabled:
i)brick statedumps seem to be listing the same lock information multiple times.
While that is getting fixed, make changes to the .ts to check for unique values.
ii)detecting a brick as online via brick_up_status() seems to be taking
longer time when delaygen is enabled. Hence bump up PROCESS_UP_TIMEOUT to
90 for afr-lock-heal-advanced.t
Updates: #1042
Change-Id: Ife76008f7a99dd1f1fe5791a32577366baaab4b3
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Current implementation assumes that ping-event will come after connect event
but that may not be the case in the cases where after socket connection fds
need to be re-opened which would consume more time. So handle any order of the
ping/child-up events.
fixes: bz#1800583
Change-Id: I6bcdc0caa503bdc039ef2b4739fbf4afae121f05
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Description of problem:
server.statedump-path is the path where statedumps are stored,
by default it is /var/run/gluster. And can be set to any valid
directory path. It was observed that server.statedump-path was
also accepting file, non-existent file and non-existent paths
as well. And statedump command was successful even when
statedumps with all the invalid paths.
a. A file
b. A non-existent path
Solution:
Added a validation function in gluster-volume-set.c which will
allow volume set to success if it's a valid directory
and in all other cases, volume set should fail.
Fixes: bz#1787122
Change-Id: Ia66e2b3d35f23efc5444c829928779a79d827b42
Signed-off-by: yatipadia <ypadia@redhat.com>
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issue:
1- In a cluster of 3 Nodes N1, N2, N3. Create 3 volumes vol1,
vol2, vol3 with 3 bricks (one from each node)
2- Set cluster.brick-multiplex on
3- Start all 3 volumes
4- Check if all bricks on a node are running on same port
5- Kill N1
6- Set performance.readdir-ahead for volumes vol1, vol2, vol3
7- Bring N1 up and check volume status
8- All bricks processes not running on N1.
Root Cause -
Since, There is a diff in volfile versions in N1 as compared
to N2 and N3 therefore glusterd_import_friend_volume() is called.
glusterd_import_friend_volume() copies the new_volinfo and deletes
old_volinfo and then calls glusterd_start_bricks().
glusterd_start_bricks() looks for the volfiles and sends an rpc
request to glusterfs_handle_attach(). Now, since the volinfo
has been deleted by glusterd_delete_stale_volume()
from priv->volumes list before glusterd_start_bricks() and
glusterd_create_volfiles_and_notify_services() and
glusterd_list_add_order is called after glusterd_start_bricks(),
therefore the attach RPC req gets an empty volfile path
and that causes the brick to crash.
Fix- Call glusterd_list_add_order() and
glusterd_create_volfiles_and_notify_services before
glusterd_start_bricks() cal is made in glusterd_import_friend_volume
Change-Id: Idfe0e8710f7eb77ca3ddfa1cabeb45b2987f41aa
Fixes: bz#1773856
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Adding disperse-data to gluster manual under
volume create command
Change-Id: Ic9eb47c9e71a1d7a11af9394c615c8e90f8d1d69
Fixes: bz#1668239
Signed-off-by: Rishubh Jain <risjain@redhat.com>
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In a hyperconverged setup with granular-entry-heal enabled, if a file is
recreated while one of the bricks is down, and an index heal is triggered
(with the brick still down), entry-self heal was doing a spurious heal
with just the 2 good bricks. It was doing a post-op leading to removal
of the filename from .glusterfs/indices/entry-changes as well as
erroneous setting of afr xattrs on the parent. When the brick came up,
the xattrs were cleared, resulting in the renamed file not getting
healed and leading to gfid split-brain and EIO on the mount.
Fix:
Proceed with entry heal only when shd can connect to all bricks of the replica,
just like in data and metadata heal.
fixes: bz#1801624
Change-Id: I916ae26ad1fabf259bc6362da52d433b7223b17e
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Changelog creates threads even if the changelog is not enabled
Background:
Changelog xlator broadly does two things
1. Journalling - Cosumers are geo-rep and glusterfind
2. Event Notification for registered events like (open, release etc) -
Consumers are bitrot, geo-rep
The existing option "changelog.changelog" controls journalling and
there is no option to control event notification and is enabled by
default. So when bitrot/geo-rep is not enabled on the volume, threads
and resources(rpc and rbuf) related to event notifications consumes
resources and cpu cycle which is unnecessary.
Solution:
The solution is to have two different options as below.
1. changelog-notification : Event notifications
2. changelog : Journalling
This patch introduces the option "changelog-notification" which is
not exposed to user. When either bitrot or changelog (journalling)
is enabled, it internally enbales 'changelog-notification'. But
once the 'changelog-notification' is enabled, it will not be disabled
for the life time of the brick process even after bitrot and changelog
is disabled. As of now, rpc resource cleanup has lot of races and is
difficult to cleanup cleanly. If allowed, it leads to memory leaks
and crashes on enable/disable of bitrot or changelog (journal) in a
loop. Hence to be safer, the event notification is not disabled within
lifetime of process once enabled.
Change-Id: Ifd00286e0966049e8eb9f21567fe407cf11bb02a
Updates: #475
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: With lookup-optimize set to on by default, a client with
stale-layout can create a new file on a wrong subvol. This will lead to
possible duplicate files if two different clients attempt to create the
same file with two different layouts.
Solution: Send in-memory layout to be cross checked at posix before
commiting a "create". In case of a mismatch, sync the client layout with
that of the server and attempt the create fop one more time.
test: Manual, testcase(attached)
fixes: bz#1786679
Change-Id: Ife0941f105113f1c572f4363cbcee65e0dd9bd6a
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The number of signing process threads (glfs_brpobj)
is set to 4 by default. The recommendation is to set
it to number of cores available. This patch makes it
configurable as follows
gluster vol bitrot <volname> signer-threads <count>
fixes: bz#1797869
Change-Id: Ia883b3e5e34e0bc8d095243508d320c9c9c58adc
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
'nfsnobody' user and group is merged with
'nobody' user and group in RHEL8.
Tests are modified to use appropriate user and group.
BUG: 1756900
Change-Id: I59863da2262283b00b1cb417d3652ebe29a36407
Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
For files: During metadata heal, we restore timestamps
only for non-regular (char, block etc.) files.
Extenting it for regular files as timestamp is updated
via touch command also
fixes: bz#1787274
Change-Id: I26fe4fb6dff679422ba4698a7f828bf62ca7ca18
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: At the time of coming up one server node(1x3) after reboot
client is unmounted.The client is unmounted because a client
is getting AUTH_FAILED event and client call fini for the graph.The
client is getting AUTH_FAILED because brick is not attached with a
graph at that moment
Solution: To avoid the unmounting the client graph throw ENOENT error
from server in case if brick is not attached with server at
the time of authenticate clients.
Credits: Xavi Hernandez <xhernandez@redhat.com>
Change-Id: Ie6fbd73cbcf23a35d8db8841b3b6036e87682f5e
Fixes: bz#1793852
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If non-standard ssh-port is used, Geo-rep can be configured to use ssh port
by using config option, the value should be in allowed port range and non negative.
At present it can accept negative value and outside allowed port range which is incorrect.
Many Linux kernels use the port range 32768 to 61000.
IANA suggests it should be in the range 1 to 2^16 - 1, so keeping the same.
$ gluster volume geo-replication master 127.0.0.1::slave config ssh-port -22
geo-replication config updated successfully
$ gluster volume geo-replication master 127.0.0.1::slave config ssh-port 22222222
geo-replication config updated successfully
This patch fixes the above issue and have added few validations around this
in test cases.
Change-Id: I9875ab3f00d7257370fbac6f5ed4356d2fed3f3c
Fixes: bz#1792276
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There exists a window of 15 sec, where the deletes are picked up
by history crawl when the ignore_deletes is set to true.
And it eventually deletes the file/s from slave which is/are not
supposed to be deleted. Though it is working as per design, a
note regarding this is needed.
Added a warning message indicating the same.
Also logged info when the worker restarts after ignore-deletes
option set.
fixes: bz#1708603
Change-Id: I103be882fac18b4cef935efa355f5037a396f7c1
Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: default option itransport.address-family is disappered
in volume info output after a volume reset.
Cause: with 3.8.0 onwards volume option transport.address-family
has default value, any volume which is created will have this
option set. So, volume info will show this in its output. But,
with reset volume, this option is not handled.
Solution: In glusterd_enable_default_options(), we should add this
option along with other default options. This function is called
by glusterd_options_reset() with volume reset command.
fixes: bz#1786478
Change-Id: I58f7aa24cf01f308c4efe6cae748cc3bc8b99b1d
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Sometime fops like posix_writev, posix_fallocate, posix_zerofile
failed and throw error ENOSPC if storage.reserve threshold limit
has reached even fops is overwriting the data
Solution: Retry the fops in case of overwrite if diskspace check
is failed
Credits: kinsu <vpolakis@gmail.com>
Change-Id: I987d73bcf47ed1bb27878df40c39751296e95fe8
Updates: #745
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Consistently fix 'configued' -> 'configured' typo, remove useless
members from 'socket_private_t' and unused 'transport.socket.lowlat'
option. Adjust tests as well.
Change-Id: I285be196457763aec16b184acd26b90623074dec
Updates: bz#1193929
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Imagine the following set of operations:
1. touch $M0/file
2. ln $M0/file $M0/file.lnk
3. rm $M1/file
4. wait for md-cache-timeout
5. stat $M0/file.lnk $M0/file
stat on $M0/file in step 5 succeeds when md-cache-timeout is non-zero
even though it was removed from $M1. The reason is
1. fuse-bridge on $M0 would resolve both file and file.lnk to same
inode. This inode i1 is cached in md-cache.
2. After md-cache-timeout lookup on $M0/file.lnk would be sent to
backend. This lookup will be successful as file.lnk is present on
backend and would refresh the md-cache.
3. The lookup on $M0/file sent within md-cache-timeout after lookup on
$M0/file.lnk would be sent with i1. Since i1 was refreshed by
lookup on $M0/file.lnk, the cache is deemed valid and md-cache
responds lookup on $M0/file as success
To fix this failure we can either:
1. remove lookup on $M0/file.lnk, so that lookup on $M0/file is
actually sent to backend.
2. turn off md-cache
This patch chooses option 2.
credits: Csaba Henk <csaba@redhat.com>
Change-Id: I352c2acd377fe10c4bdf3b6e53c1de86a4e544c7
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Updates: bz#1756900
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
On a freshly installed system non-root geo-rep test case gets blocked.
Solution:
On a freshly installed system, the remote key need to be accepted automatically by ssh-copy-id.
Credits: M. Scherer <mscherer@redhat.com>
Change-Id: I5077f99a6681660f7e3e84c25ef216f521b7c29c
Fixes: bz#1779742
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The old command for log rotate is still present removing
it completely. Also adding testcase to test the
log rotate command with both the old as well as the new command
and fixing testcase which use the old syntax to use the new
one.
Code to be removed:
1. In cli-cmd-volume.c from struct cli_cmd volume_cmds[]:
{"volume log rotate <VOLNAME> [BRICK]", cli_cmd_log_rotate_cbk,
"rotate the log file for corresponding volume/brick"
" NOTE: This is an old syntax, will be deprecated from next release."},
2. In cli-cmd-volume.c from cli_cmd_log_rotate_cbk():
||(strcmp("rotate", words[2]) == 0)))
3. In cli-cmd-parser.c from cli_cmd_log_rotate_parse()
if (strcmp("rotate", words[2]) == 0)
volname = (char *)words[3];
else
fixes: bz#1750387
Change-Id: I56e4d295044e8d5fd1fc0d848bc87e135e9e32b4
Signed-off-by: kshithijiyer <kshithij.ki@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When touch is used to create a file, the ctime is not matching
atime and mtime which ideally should match. There is a difference
in nano seconds.
Cause:
When touch is used modify atime or mtime to current time (UTIME_NOW),
the current time is taken from kernel. The ctime gets updated to current
time when atime or mtime is updated. But the current time to update
ctime is taken from utime xlator. Hence the difference in nano seconds.
Fix:
When utimesat uses UTIME_NOW, use the current time from kernel.
fixes: bz#1773530
Change-Id: I9ccfa47dcd39df23396852b4216f1773c49250ce
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: On RHEL-8 ssl test case is failing when trying to
connect with a peer after using the specific cipher.
Solution: If cipher is not supported by openssl on rhel-8 then
test case is failed.To avoid the issue validate the
cipher before connecting with peer.
Change-Id: I96d92d3602cf7fd40337126c8305a3f8925faf9b
Fixes: bz#1756900
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Configure the list of gluster servers in the key
GLUSTERD_BRICK_SERVERS at the time of GETSPEC RPC CALL
and access the value in client side to update volfile
serve list so that client would be able to connect
next volfile server in case of current volfile server
is down
Updates #741
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Change-Id: I23f36ddb92982bb02ffd83937a8bd8a2c97e8104
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: When one of the node is down in cluster,
rebalance status is not displaying detailed
information.
Cause: In glusterd_volume_rebalance_use_rsp_dict()
we are aggregating rsp from all the nodes into a
dictionary and sending it to cli for printing. While
assigning a index to keys we are considering all the
peers instead of considering only the peers which are
up. Because of which, index is not reaching till 1.
while parsing the rsp cli unable to find status-1
key in dictionary and going out without printing
any information.
Solution: The simplest fix for this without much
code change is to continue to look for other keys
when status-1 key is not found.
fixes: bz#1764119
Change-Id: I0062839933c9706119eb85416256eade97e976dc
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: At the time of cleanup rpc object ssl specific data
is not freeing so it has become a leak.
Solution: To avoid the leak cleanup ssl specific data at the
time of cleanup rpc object
Credits: l17zhou <cynthia.zhou@nokia-sbell.com.cn>
Fixes: bz#1768407
Change-Id: I37f598673ae2d7a33c75f39eb8843ccc6dffaaf0
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. Both read and write tests required writing first. Either just
writing (write test) or write and then read (read test).
So the code is now unified.
2. There's no reason to read zeros from /dev/zero. Just use a
CALLOC'ed buffer.
I don't think we should read and write zeros, but I did not change
the code yet (I think compression and/or dedup will offset results)
It appears neither read-perf nor write-perf were tested, so added
basic tests for them.
Change-Id: I24b1f249fa0335ed652a8982e99c0687d940230e
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implements lock healing for gluster-block fencing use case.
If mandatory lock is enabled:
- Add domain lock/unlock to afr_lk fop.
- Maintain a list of locks to be healed in afr_private_t.
- Add lock to the list if afr_lk(F_SETLK or F_SETLKW) was sucessful.
- Remove it from the list during afr_lk(F_UNLCK).
- On child_down, mark lock as needing heal on that child. If lock is
lost on quorum no. of bricks, remove it from the list and mark fd bad.
- For fds marked as bad, fail the subsequent fd based fops.
- On parent up, traverse the list and heal the locks IFF the client is
the lk owner and has quorum. (shd does not heal any locks).
updates: #613
Change-Id: I03c46ceaea30f5e6236d5ec13f71d843d827f1bc
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
On a rhel-8 machine, we need to have a key length
greater than or eaual to 2048. So changing the values
to 2048 to pass the test.
Credits: Mohit Agrawal <moagrawal@redhat.com>
Change-Id: I0f21db4d737203d0b2e44e7e61f50ae1279795ad
Updates: bz#1756900
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
| |
Export of env variable is required for ssh-copy-id command.
fixes: bz#1765426
Change-Id: Icaf7a848cb8f4ae9f887d885a8c5bb71f26633b4
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|