| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
This patch addresses post-merge review comments for commit
5784a00f997212d34bd52b2303e20c097240d91c
Change-Id: I7ed954664a2ae8e1091d23ee3ceb9c66e83bfeac
fixes: bz#1697930
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: bug-1650403.t and bug-858215.t throw an error
when accessing the glustershd pidfile
Solution: Use the ps command to find out the glustershd pid
Change-Id: I3477345b6220aa039e012e674cba21d741e9abab
fixes: bz#1697486
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
| |
fixes: bz#1699176
credits: Hari Gowtham <hgowtham@redhat.com>
Change-Id: I59134336febf0dc4043483f2f413ac83e3bc79f5
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
| |
Just to make sure all files are listed, which means we get maximum code coverage
updates: bz#1693692
Change-Id: I11d36ac2f4d6d4fb91223aacd423ad23242eb454
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
| |
As the same functionality is covered in glusterd_volinfo_find
Updates: bz#1193929
Change-Id: I2308c5fa9b2ca9edaa95f172d0bd914103808c36
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a gluster node in trusted storage pool has failed
due to hardware issues, volume delete operation fails
saying "Not all peers are up" and peer detach for failed
node fails saying "Brick(s) with peer <peer_ip> exists
in cluster".
The idea here is to use either the replace-brick or the remove-brick
command to remove all the bricks hosted by the failed node and
then re-attempt the peer detach. This change adds this
hint to the peer detach error message.
fixes: bz#1697866
Change-Id: I0c58887479d31db603ad8d6535ea9d547880ccc8
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On RHEL-6 there is no support for SEEK_HOLE/SEEK_DATA and this causes
the POSIX xlator to return errno=EINVAL. Because of this, the rpc-server
xlator will log all 'failed' seek attempts. When applications call
seek() often, the brick logs can grow very quickly and fill up the
disks.
Messages that get logged are like
[server-rpc-fops.c:2091:server_seek_cbk] 0-vol01-server: 4947: SEEK-2 (53920aee-062c-4598-aa50-2b4d7821b204), client: worker.example.com-7808-2019/02/08-18:04:57:903430-vol01-client-0-0-0, error-xlator: vol01-posix [Invalid argument]
The problem can be reproduced by running a Gluster Server on RHEL-6,
with a client running on RHEL-7. The client should execute an
application that calls lseek() with SEEK_HOLE/SEEK_DATA.
Change-Id: I7b6c16f8e0ba1a183e845cfdb8d5a3f8caeab138
Fixes: bz#1697316
Signed-off-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a race condition between the rpc_transport layer
and client fini.
Sequence of events that leads to the race condition:
1) When we want to destroy a graph, we send a parent down
event first
2) Once parent down is received on a client xlator, we
initiate an rpc disconnect
3) This in turn generates a child down event.
4) When we process child down, we first do fini for
every xlator
5) On successful return of fini, we delete the graph
After step 5, there is a chance that the fini
on the client has not finished, because an rpc_transport
ref can race with the above sequence.
So we have to wait till all rpcs are successfully freed
before returning from the client's fini
Change-Id: I20145662d71fb837e448a4d3210d1fcb2855f2d4
fixes: bz#1659708
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch contains the following changes:
1) Store ID info will now be stored in the inode ctx
2) Added new readv type where read is made directly
from the remote store. This choice is made by
volume set operation.
3) cs_forget() was missing. Added it.
Change-Id: Ie3232b3d7ffb5313a03f011b0553b19793eedfa2
fixes: bz#1642168
Signed-off-by: Anuradha Talur <atalur@commvault.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The helper function get_fd_count() returns how many open fds a given
gfid has on a brick. It could happen that the brick doesn't have information
about that inode because it has not been previously accessed.
Before this patch, the function returned "" when the inode was not
present. This caused basic/ec/ec-fix-openfd.t test to fail because it
was expecting '0' as the result.
This patch forces get_fd_count() to return '0' when the gfid is not
present in the state dump.
Change-Id: I848b57744e96656bf81fbb7b126a5faf44e535eb
updates: bz#1193929
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Main changes include logic to update iatt buf
with file size from extended attributes in posix
rather than having this logic in cloudsync xlator.
Change-Id: I44f5f8df7a01e496372557fe2f4eff368dbdaa33
fixes: bz#1642168
Signed-off-by: Anuradha Talur <atalur@commvault.com>
|
|
|
|
|
|
|
|
|
|
| |
1) The placement of cloudsync xlator has been changed
to make it shard xlator's child. If cloudsync has to
work with shard in the graph, it needs to be child of shard.
Change-Id: Ib55424fdcb7ce8edae9f19b8a6e3d3ba86c1f0c4
fixes: bz#1642168
Signed-off-by: Anuradha Talur <atalur@commvault.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Protocol implements every fop and, in general, accounts for a large part of
the codebase. Considering our regression runs mostly on one machine,
there was no way of forcing the client to use the old protocol (while the new
one is available). With this patch, a new 'testing' option is provided
which forces the client to use the old protocol if found.
This should help increase the code coverage by at least 10k lines overall.
updates: bz#1693692
Change-Id: Ie45256f7dea250671b689c72b4b6f25037cef948
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Test ec-cpu-extensions.t has been modified so that it uses a bigger
matrix. This makes use of more functions from ec-code-c.c. Changing
read-policy to round-robin increases the number of functions used even
further, reaching 100% line and function coverage for this file.
Change-Id: I26e4d33269cbd67f5d76d862f4cf1e69285e85e1
updates: bz#1193929
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
| |
this test alone covers most of code of trace xlator
updates: bz#1693692
Change-Id: I287c72ee89bd1c02d992b020d5644e8dac0b77ab
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: The glusterfs build fails with an undefined
reference to `dlclose' on RHEL 6
Solution: Add the LIB_DL link in Makefile.am to resolve it
Fixes: bz#1696512
Change-Id: I58019ca9e29d569d8e6df282b8ab178ad540843b
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Considering ctime is a client side feature, we can't blindly load the ctime
xlator into the client graph if it's explicitly turned off; that would
result in a backward compatibility issue where an old client can't mount
a volume configured on a server which has the ctime feature.
Fixes: bz#1697907
Change-Id: I6ae7b96d056073aa6746de9a449cf319786d45cc
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Its value is not going to change within the loop, as far as I can
understand the code.
Fetch and store it outside the loop.
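The change described above is the usual loop-invariant hoisting. A minimal sketch of the idea follows; the function count_with_prefix() and its arguments are hypothetical and are not taken from the patched code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example: count the names that start with a given prefix. */
size_t count_with_prefix(const char *const *names, size_t count,
                         const char *prefix)
{
    size_t prefix_len;
    size_t matches = 0;
    size_t i;

    assert(names != NULL && prefix != NULL);

    /* The prefix does not change inside the loop, so its length is
     * fetched once here instead of on every iteration. */
    prefix_len = strlen(prefix);

    for (i = 0; i < count; i++) {
        if (strncmp(names[i], prefix, prefix_len) == 0)
            matches++;
    }
    return matches;
}
```

The result is identical to calling strlen() inside the loop; only the repeated work is removed.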
Change-Id: I6327c23212dceec6006349421ef185495892dd8a
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following pattern was found in multiple places, where both
glusterd_check_volume_exists and glusterd_volinfo_find do the same job.
We just need one of them, not both. In a scaled environment with many
volumes, iterating over the volume list twice to find a single volume
is a bottleneck!
exists = glusterd_check_volume_exists(volname);
ret = glusterd_volinfo_find(volname, &volinfo);
if ((ret) || (!exists)) {
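A minimal sketch of why one lookup is enough: a find function that both reports existence through its return value and hands back the entry. The struct volinfo and volinfo_find() below are simplified, hypothetical stand-ins and not the glusterd definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified volume list node. */
struct volinfo {
    const char *volname;
    struct volinfo *next;
};

/* Walks the list once. Returns 0 and sets *found when the volume
 * exists, returns -1 and sets *found to NULL otherwise. */
int volinfo_find(struct volinfo *head, const char *volname,
                 struct volinfo **found)
{
    struct volinfo *cur;

    assert(volname != NULL && found != NULL);
    *found = NULL;
    for (cur = head; cur != NULL; cur = cur->next) {
        if (strcmp(cur->volname, volname) == 0) {
            *found = cur;
            return 0;
        }
    }
    return -1;
}
```

A caller only needs to check the return value; a separate "exists" pass over the same list adds no information.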
Credits: ykaul@redhat.com for finding this out
Updates: bz#1193929
Change-Id: Ie116fe5c93e261a2bddd267c28ccb20a2884a36f
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Set the pointer to NULL after GF_FREE() and check the pointer value
before calling GF_FREE() to avoid referencing memory after it has been freed
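The pattern can be sketched with the standard free() standing in for GF_FREE(). The struct peer_entry and peer_entry_clear() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical owner of a heap-allocated string. */
struct peer_entry {
    char *hostname;
};

/* Releases the string and resets the pointer, so a second call
 * (or a later NULL check) never touches freed memory. */
void peer_entry_clear(struct peer_entry *entry)
{
    assert(entry != NULL);
    if (entry->hostname != NULL) {
        free(entry->hostname);
        entry->hostname = NULL;
    }
}
```

Calling peer_entry_clear() twice on the same entry is harmless, which is the property the patch is after.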
CID: 1398622
Change-Id: Iba0d8879abccf5923a69132a207d53bb94551417
updates: bz#789278
Signed-off-by: rishubhjain <rishubhjain47@gmail.com>
|
|
|
|
|
|
|
|
|
|
| |
Part 1: refactor the dht_lookup_dir_cbk
and dht_selfheal_directory functions.
Added a simple dht selfheal directory test
Change-Id: I1410c26359e3c14b396adbe751937a52bd2fcff9
updates: bz#1590385
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
When split-brain choice is changed from one brick to another
brick, inode-invalidate is not called so readv call is served
from cache leading to failures in split-brain-resolution.t.
Fixed it by calling inode_invalidate() when this happens.
updates bz#1193929
Change-Id: I2624614eec38c0303f3e1dc55dfae3d4b864218b
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For testing the recovery of bad (or corrupted files) in a dispersed
volume, first enable self-heal daemon and let heal happen.
In bitrot feature, if a file becomes corrupted, the solution recommended
is to remove that file directly from the backend and then allowing heal
to happen. Hence turn on self-heal daemon and allow the heal to happen
after removing corrupted copy from the backend.
Change-Id: I7186110398ec1aee7e5727b9d1aac9a01db4d831
fixes: bz#1695327
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we use the heal info command, it takes a lot of time, as in
some cases it takes a lock on entries to find out if the
entry actually needs heal or not.
There are some cases where we can avoid these locks and
can conclude if the entry needs heal or not.
1 - We do a lookup (without lock) on an entry, which we found in
.glusterfs/indices/xattrop, and find that lock count is
zero. Now if the file contains dirty bit set on all or any
brick, we can say that this entry needs heal.
2 - If the lock count is one and dirty is greater than 1,
then it also means that some fop had left the dirty bit set
which made the dirty count of current fop (which has taken lock)
more than one. At this point also we can definitely say that
this entry needs heal.
This patch is modifying code to take into consideration above two
points.
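The two lock-free conclusions above can be sketched as a small decision function. This is only an illustration under stated assumptions: the enum, the function name and the per-brick dirty array are hypothetical and do not mirror the EC xlator's data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical verdict: either heal is certainly needed, or the
 * caller has to fall back to the locked inspection. */
enum heal_verdict { HEAL_NEEDED, HEAL_UNDECIDED };

enum heal_verdict heal_verdict_without_lock(unsigned int lock_count,
                                            const unsigned int *dirty,
                                            size_t brick_count)
{
    unsigned int max_dirty = 0;
    size_t i;

    assert(dirty != NULL || brick_count == 0);
    for (i = 0; i < brick_count; i++) {
        if (dirty[i] > max_dirty)
            max_dirty = dirty[i];
    }

    /* Case 1: nobody holds a lock, yet some brick is dirty. */
    if (lock_count == 0 && max_dirty > 0)
        return HEAL_NEEDED;

    /* Case 2: one fop holds the lock, but the dirty count is above
     * what that single fop could have set. */
    if (lock_count == 1 && max_dirty > 1)
        return HEAL_NEEDED;

    return HEAL_UNDECIDED;
}
```

Anything that returns HEAL_UNDECIDED would still go through the slower, locked path.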
It also changes the code to not call ec_heal_inspect if ec_heal_do
was called from client side heal. Client side heal triggers heal
only when it is sure that heal is required.
[We have changed the code to not call heal for lookup]
updates bz#1689799
Change-Id: I7f09f0ecd12f65a353297aefd57026fd2bebdf9c
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
|
|
|
|
|
|
| |
Change-Id: Iec5ce7f17fbf899f881a58cd20c4c967e3b71668
fixes: bz#1642168
Signed-off-by: Anuradha Talur <atalur@commvault.com>
|
|
|
|
|
|
|
|
|
|
|
| |
we have 'sdfs-sanity.t' which covers at least 90% of the functions
and 70% of lines in the translator. But the recent changes to
disable it due to performance impact made even the test to not
consider the translator.
updates: bz#1693692
Change-Id: I0ebcb307c4ab48a6e59ded27bf39f72ce2304ebc
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In release-6 rpc/rpc-lib (libgfrpc) added the function
get_rightmost_set_bit() which calls log2(3), a call that takes
a floating point parameter.
It's used thusly:
right_most_unset_bit = get_rightmost_set_bit(...);
(So is it really the right-most unset bit, or the right-most set bit?)
It's unclear to me whether this is in the data path or not. If it is,
it's rather scary to think about integer-to-float conversions and slow
calls to libm functions in the data path.
gcc and clang have __builtin_ctz() which returns the same result as
get_rightmost_set_bit(), and does it substantially faster. Approx
20M iterations of get_rightmost_set_bit() took ~33sec of wall clock
time on my devel machine, while 20M iterations of __builtin_ctz()
took < 9sec; get_rightmost_set_bit() is 3x slower than __builtin_ctz().
And as a side benefit, we can again eliminate the need to link libgfrpc
with libm.
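For illustration, here is a sketch comparing a plain bit-walking helper with __builtin_ctz(). Both function names are hypothetical; the loop version is not the libgfrpc implementation, it only serves as a libm-free reference for the same result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference version: walk the bits from the right. */
int rightmost_set_bit_loop(uint32_t value)
{
    int index = 0;

    assert(value != 0); /* no set bit means there is no answer */
    while ((value & 1u) == 0) {
        value >>= 1;
        index++;
    }
    return index;
}

/* Same result through the gcc/clang builtin, without libm. */
int rightmost_set_bit_ctz(uint32_t value)
{
    assert(value != 0); /* __builtin_ctz(0) is undefined */
    return __builtin_ctz(value);
}
```

For 0x50 (binary 1010000) both versions return 4, the zero-based index of the lowest set bit.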
Change-Id: If9e7e80874577c52223f8125b385fc930de20699
updates: bz#1193929
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
| |
Seems to be unused.
Change-Id: I75eed9641dd030a1fbb1b942a9d818f10a7e1437
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
this works as a better solution, as we reuse more functions from library.
Also just do write/read on a file when acl is enabled, so we can see
improvement in code coverage.
updates: bz#1693692
Change-Id: If3359260c8ec2cf4fcf148fb4b95fdecc922c252
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
| |
Change-Id: If3fc0884e7e2f45de2d278b98693b7a473220a5f
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Fixes: bz#1691616
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Found missing assignment of lk-owner for an inodelk/entrylk before winding
the fops. locks xlator at the moment allows this operation. This leads to
multiple threads in the same client being able to get locks on the inode
because lk-owner is same and transport is same. So isolation with locks can't
be achieved. To fix it, we need locks xlator change which will disallow
null-lk-owner based inodelk/entrylk/lk. To achieve that we need to first
fix all the places which do this mistake.
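A rough sketch of the kind of check the locks xlator could apply once all callers set an owner. The struct layout, LK_OWNER_MAX, the "all bytes zero means null" rule and the -EINVAL result are assumptions made for this example, not the actual gluster definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define LK_OWNER_MAX 16 /* hypothetical size, for illustration only */

/* Hypothetical, simplified lock owner. */
struct lk_owner {
    size_t len;
    unsigned char data[LK_OWNER_MAX];
};

/* Assumption: an owner with no non-zero byte counts as "null". */
int lk_owner_is_null(const struct lk_owner *owner)
{
    size_t i;

    assert(owner != NULL && owner->len <= LK_OWNER_MAX);
    for (i = 0; i < owner->len; i++) {
        if (owner->data[i] != 0)
            return 0;
    }
    return 1;
}

/* Returns 0 when the request may proceed, -EINVAL for a null owner. */
int check_lock_request(const struct lk_owner *owner)
{
    return lk_owner_is_null(owner) ? -EINVAL : 0;
}
```

With such a check in place, two threads of one client can no longer share a lock by accident just because neither of them set an owner.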
updates bz#1624701
Change-Id: Ic3431da3f451a1414f1f4fdcfc4cf41e555f69dd
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
GF_LOG_OCCASIONALLY doesn't log on the first instance but only on every
42nd iteration. This isn't effective, as in some cases the code flow might
not hit the same log as many as 42 times and we'd end up suppressing the
log entirely.
Fixes: bz#1694925
Change-Id: Iee293281d25a652b64df111d59b13de4efce06fa
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
| |
It helps in increased code coverage of playground.
updates: bz#1693692
Change-Id: I81bcf30be1450948a6360d8915f06b973387a560
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
The shd daemon is per node, which means it creates a graph
with all volumes on it. While this is great for utilizing
resources, it is not so good in terms of performance and manageability,
because self-heal daemons don't have the capability to automatically
reconfigure their graphs. So each time any configuration
change happens to the volumes (replicate/disperse), we need to restart
shd to bring the changes into the graph.
Because of this, all ongoing heals for all other volumes have to be
stopped in the middle and restarted all over again.
Solution:
This change makes shd a per volume daemon, so that a graph
will be generated for each volume.
When we want to start/reconfigure shd for a volume, we first search
for an existing shd running on the node; if there is none, we
start a new process. If a daemon is already running for shd, then
we simply detach the graph for the volume and reattach the updated
graph for the volume. This won't touch any of the ongoing operations
for any other volumes on the shd daemon.
Example of an shd graph when it is per volume
                 graph
        -----------------------
        |     debug-iostat    |
        -----------------------
         /         |         \
        /          |          \
---------      ---------      ---------
| AFR-1 |      | AFR-2 |      | AFR-3 |
---------      ---------      ---------
A running shd daemon with 3 volumes will be like-->
                  graph
         -----------------------
         |     debug-iostat    |
         -----------------------
          /         |         \
         /          |          \
------------  ------------  ------------
| volume-1 |  | volume-2 |  | volume-3 |
------------  ------------  ------------
Change-Id: Idcb2698be3eeb95beaac47125565c93370afbd99
fixes: bz#1659708
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If an open comes on a file when a brick is down and after the brick comes up,
a fop comes on the fd, client xlator would still wind the fop on anon-fd
leading to wrong behavior of the fops in some cases.
Example:
If lk fop is issued on the fd just after the brick is up in the scenario above,
lk fop will be sent on anon-fd instead of failing it on that client xlator.
This lock will never be freed upon close of the fd as flush on anon-fd is
invalid and is not wound below server xlator.
As a fix, failing the fop unless the fd has FALLBACK_TO_ANON_FD flag.
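The decision can be sketched as follows. Only the flag name FALLBACK_TO_ANON_FD comes from the description above; its value, the struct client_fd, the enum and the choice of EBADF as error code are assumptions made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Flag name from the commit message; the value is hypothetical. */
#define FALLBACK_TO_ANON_FD 0x1u

/* Hypothetical client-side fd state: remote_fd is -1 when the fd was
 * never opened on the brick (for example, the brick was down). */
struct client_fd {
    int remote_fd;
    unsigned int flags;
};

enum fd_choice { USE_REMOTE_FD, USE_ANON_FD, FAIL_FOP };

enum fd_choice choose_fd(const struct client_fd *fd, int *op_errno)
{
    assert(fd != NULL && op_errno != NULL);

    if (fd->remote_fd >= 0)
        return USE_REMOTE_FD;

    /* Anonymous fds are used only when explicitly allowed. */
    if (fd->flags & FALLBACK_TO_ANON_FD)
        return USE_ANON_FD;

    *op_errno = EBADF; /* assumed error code */
    return FAIL_FOP;
}
```

A lock fop on an fd without the flag would therefore fail on that client xlator instead of leaving a lock on an anon-fd that is never released.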
Change-Id: I77692d056660b2858e323bdabdfe0a381807cccc
fixes bz#1390914
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Fixes afr_ta_read_txn() to handle the inode refresh failure
code-path.
- Fixes a double free issue of dict.
Note: This patch addresses post-merge review comments for commit
69532c141be160b3fea03c1579ae4ac13018dcdf
fixes: bz#1686398
Change-Id: Id5299b45b68569d47df6b73755918237a1592cb4
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In scenarios where a mount fails before creating the log file, it doesn't
make sense to print the 'check log file' message. See below:
```
ERROR: failed to create logfile "/var/log/glusterfs/mnt.log" (No space left on device)
ERROR: failed to open logfile /var/log/glusterfs/mnt.log
Mount failed. Please check the log file for more details.
```
Fixes: bz#1688068
Change-Id: I1d837caa4f9bc9f1a37780783e95007e01ae4e3f
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With commit#ad35193, we made changes to offload
processing of upcall notifications to synctask so as not
to block epoll threads. However, it seems the issue wasn't
fully addressed.
In "glfs_cbk_upcall_data" -> "synctask_new1" after creating synctask
if there is no callback defined, the thread waits on synctask_join
till the syncfn is finished. So that way even with those changes,
epoll threads are blocked till the upcalls are processed.
Hence the right fix now is to define a callback function for that
synctask "glfs_cbk_upcall_syncop" so as to unblock epoll/notify threads
completely and the upcall processing can happen in parallel by synctask
threads.
Change-Id: I4d8645e3588fab2c3ca534e0112773aaab68a5dd
fixes: bz#1693575
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Geo-rep fails to sync the rename of a symlink if it's
renamed multiple times and the creation and renames
happen in quick succession
Worker crash at slave:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", in entry_ops
    [ESTALE, EINVAL, EBUSY])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", in errno_wrap
    return call(*arg)
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", in lsetxattr
    cls.raise_oserr()
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", in raise_oserr
    raise OSError(errn, os.strerror(errn))
OSError: [Errno 12] Cannot allocate memory
Geo-rep Behaviour:
1. SYMLINK doesn't record target path in changelog.
So while syncing SYMLINK, readlink is done on
master to get target path.
2. Geo-rep will create destination if source is not
present while syncing RENAME. Hence while syncing
RENAME of SYMLINK, target path is collected from
destination.
Cause:
If a symlink is created and renamed multiple times, creation of
the symlink is ignored, as it's no longer present on master at
that path. When the symlink is renamed multiple times at master,
then while syncing the first RENAME of SYMLINK, neither source nor
destination is present, hence the target path is not known. In this case,
while creating the destination directly at slave, regular file
attributes were encoded into the blob instead of symlink attributes,
causing a failure in the gfid-access translator while decoding
the blob.
Solution:
While syncing RENAME of SYMLINK, when the target is not known
and neither src nor destination is present on the master,
don't create the destination. Ignore the rename. It's ok to ignore.
If it's unlinked, it's fine. If it's renamed to something else,
it will be synced then.
Change-Id: Ibdfa495513b7c05b5370ab0b89c69a6802338d87
fixes: bz#1693648
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The fops allocate 3 kinds of payload (buffer) in the client xlator:
- fop payload, this is the buffer allocated by the write and put fop
- rsphdr payload, this is the buffer required by the reply cbk of
some fops like lookup, readdir.
- rsp_payload, this is the buffer required by the reply cbk of fops like
readv etc.
Currently, in the lookup and readdir fop the rsphdr is sent as payload,
hence the allocated rsphdr buffer is also sent on the wire, increasing
the bandwidth consumption on the wire.
With this patch, the issue is fixed.
Fixes: bz#1692093
Change-Id: Ie8158921f4db319e60ad5f52d851fa5c9d4a269b
Signed-off-by: Poornima G <pgurusid@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
rpc_clnt_disable() and rpc_clnt_disconnect() have same code.
Removed rpc_clnt_disconnect() and moved calls to rpc_clnt_disconnect()
to rpc_clnt_disable()
updates bz#1193929
Change-Id: I965f57cc1d5af36d266810125558b6f5e5f279d4
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
There are two cases in which a brick is restarted: one is when glusterd
starts or quorum is met, the other is when new peers join and quorum
changes. In the latter case, sync_lock is not taken, which may cause lock
corruption.
Change-Id: I0844e7a631350f5ee00bdacb613602bffffcdf9f
fixes: bz#1692612
Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
|
|
|
|
|
|
|
|
|
| |
ssh-port validation is specified as `validation=int` in the template
`gsyncd.conf`, but this was not handled during geo-rep config set.
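As an illustration of what an integer validation step has to reject, here is a small sketch. It is written in C and does not reproduce the geo-rep config code; validate_int_option() is a hypothetical helper.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical validator: accepts only a complete base-10 integer.
 * Returns 0 and stores the value in *out (when out is not NULL),
 * or returns -1 for empty, partial or out-of-range input. */
int validate_int_option(const char *text, long *out)
{
    char *end = NULL;
    long value;

    assert(text != NULL);
    errno = 0;
    value = strtol(text, &end, 10);

    /* No digits consumed, trailing characters, or overflow. */
    if (end == text || *end != '\0' || errno == ERANGE)
        return -1;

    if (out != NULL)
        *out = value;
    return 0;
}
```

A value such as "22" passes, while "22a" or an empty string is refused before it can reach the session configuration.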
Fixes: bz#1692666
Change-Id: I3f19d9b471b0a3327e4d094dfbefcc58ed2c34f6
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
1 - heal-wait-qlength is by default 128. If shd is disabled
and we need to heal files, client side heal is needed.
If we access these files that will trigger the heal.
However, it has been observed that a file will be enqueued
multiple times in the heal wait queue, which in turn causes the
queue to fill up and prevents other files from being enqueued.
2 - While a file is going through healing and a write fop from
mount comes on that file, it sends write on all the bricks including
healing one. At the end it updates version and size on all the
bricks. However, it does not unset dirty flag on all the bricks,
even if this write fop was successful on all the bricks.
After healing completes, this dirty flag remains set and never
gets cleaned up if SHD is disabled.
Solution:
1 - If an entry is already in queue or going through heal process,
don't enqueue next client side request to heal the same file.
2 - Unset dirty on all the bricks at the end if fop has succeeded on
all the bricks even if some of the bricks are going through heal.
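The first point of the solution can be sketched as an enqueue guard. The structs and heal_enqueue_once() are hypothetical and much simpler than the EC heal queue; they only show the "already queued or healing" check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-file heal state. */
struct heal_entry {
    int queued;  /* already waiting in the heal queue */
    int healing; /* heal currently in progress */
};

/* Hypothetical bounded wait queue (only the counters are modelled). */
struct heal_queue {
    size_t length;
    size_t limit;
};

/* Returns 1 when the entry was enqueued, 0 when the request was
 * skipped because it is a duplicate or the queue is full. */
int heal_enqueue_once(struct heal_queue *queue, struct heal_entry *entry)
{
    assert(queue != NULL && entry != NULL);

    if (entry->queued || entry->healing)
        return 0;
    if (queue->length >= queue->limit)
        return 0;

    entry->queued = 1;
    queue->length++;
    return 1;
}
```

Repeated access to the same file then consumes one queue slot at most, leaving room for other files.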
Change-Id: Ia61ffe230c6502ce6cb934425d55e2f40dd1a727
updates: bz#1593224
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
|
|
|
|
|
|
|
|
|
| |
client-pid for glustershd is GF_CLIENT_PID_SELF_HEALD
client-pid for glfsheal is GF_CLIENT_PID_GLFS_HEALD
updates: bz#1689250
Change-Id: Ib3a863af160ff48c822a5e6b0c27c575c9887470
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This api offers the ability to set the pid of a client to a particular
value, identical to how gluster fuse clients provide the --client-pid
option. This is an internal API to be used by gluster processes only. See
https://lists.gluster.org/pipermail/gluster-devel/2019-March/055925.html
for more details. Currently glfsheal is the only proposed consumer.
updates: bz#1689250
Change-Id: I0620be2127d79d69cdd57cffb29bba44e6e5da1f
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
tl;dr: libgfrpc.so calls log2(3) from libm; it should be explicitly
linked with -lm
The autoconf/automake/libtool stack is more or less forgiving on
different distributions. On forgiving systems libtool will
semi-magically link with implicit dependencies. But on Ubuntu, which
seems to be tending toward being less forgiving, the link of libgfrpc
will fail with an unresolved reference to log2(3).
Change-Id: I9fae09ddb81e49004fbea4d7d83b95fb64a484b0
updates: bz#1193929
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
| |
Change-Id: I3bbda719027b45e1289db2e6a718627141bcbdc8
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
| |
updates bz#1193929
Change-Id: I01b60d644f517c00a1bcc127bf9a8ed90b6eb7a0
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit ensures the following:
1. Don't send commit op request to the remote nodes when gluster v
status all is executed as for the status all transaction the local
commit gets the name of the volumes and remote commit ops are
technically a no-op. So no need for additional rpc requests.
2. In op state machine flow, if the transaction is in staged state and
op_info.skip_locking is true, then no need to set the txn id in the
priv->glusterd_txn_opinfo dictionary which never gets freed.
Fixes: bz#1691164
Change-Id: Ib6a9300ea29633f501abac2ba53fb72ff648c822
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|