| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`address_family=inet6` needs to be added while mounting master and
slave volumes in gverify script.
New option introduced to gluster cli(`--inet6`) which will be used
internally by geo-rep while calling `gluster volume info
--remote-host=<ipv6>`.
Backport of https://review.gluster.org/22363
Fixes: bz#1695436
Change-Id: I1e0d42cae07158df043e64a2f991882d8c897837
Signed-off-by: Aravinda VK <avishwan@redhat.com>
(cherry picked from commit 240e1d6821fbb779c3dd73f6f0225d755a5b7cc6)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When eager-lock lock acquisition fails because of say network failures, the
local is not being removed from owners_list, this leads to accumulation of
waiting frames and the application will hang because the waiting frames are
under the assumption that another transaction is in the process of acquiring
lock because owner-list is not empty. Handled this case as well in this patch.
Added asserts to make it easier to find these problems in future.
fixes bz#1699731
Change-Id: I3101393265e9827755725b1f2d94a93d8709e923
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: commit c34e4161f3cb6539ec83a9020f3d27eb4759a975 set log-level
per xlator during reconfigure only for a brick process not for
the client process.
Solution: 1) Change per xlator log-level only if brick_mux is enabled.To make sure
about brick multiplex introudce a flag brick_mux at ctx->cmd_args.
Note: There are two other changes done with this patch
1) Ignore client-log-level option to attach a brick with
already running brick if brick_mux is enabled
2) Add a log to print pid of the running process to make easier
debugging
> Change-Id: I39e85de778e150d0685cd9a79425ce8b4783f9c9
> Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
> Fixes: bz#1696046
> (Cherry picked from commit 798aadbe51a9a02dd98a0f861cc239ecf7c8ed57)
> (Reviewed on upstream link https://review.gluster.org/#/c/glusterfs/+/22495/)
Change-Id: If91682830f894ab8f6857f19dcb1797fc15ca64c
Fixes: bz#1699715
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Creation of tar file on gluster volume throws warning
'file changed as we read it'
Cause:
During readdirp, for few of the files whose inode is not
present, time attributes were served from backend. This caused
the ctime of few files to be different between before readdir
and after readdir by tar.
Solution:
If ctime feature is enabled and inode is not present, don't
serve the time attributes from backend file, serve it from xattr.
Backport of:
> Patch: https://review.gluster.org/22540
> BUG: 1698078
> Change-Id: I427ef865f97399475faf5aa6ca495f7e317603ae
> Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit c56f102da21c5b69e656a055aaf736281596284d)
fixes: bz#1699703
Change-Id: I427ef865f97399475faf5aa6ca495f7e317603ae
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
ec_truncate_clean does writing under the lock granted for truncate,
but the lock is calculated by ec_adjust_offset_up, so that,
the write in ec_truncate_clean is out of lock.
Updates: bz#1699499
Change-Id: Idbe1fd48d26afe49c36b77db9f12e0907f5a4134
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
(cherry picked from commit 0e1223491e964096384edfae5032ed0d50d028ad)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In brick_mux environment, while volumes are stopped in a
loop bricks are not detached successfully. Brick's are not
detached because xprtrefcnt has not become 0 for detached brick.
At the time of initiating brick detach process server_notify
saves xprtrefcnt on detach brick and once counter has become
0 then server_rpc_notify spawn a server_graph_janitor_threads
for cleanup brick resources.xprtrefcnt has not become 0 because
socket framework is not working due to assigning 0 as a fd for socket.
In commit dc25d2c1eeace91669052e3cecc083896e7329b2
there was a change in changelog fini to close htime_fd if htime_fd is not
negative, by default htime_fd is 0 so it close 0 also.
Solution: Initialize htime_fd to -1 after just allocate changelog_priv
by GF_CALLOC
> Fixes: bz#1699025
> Change-Id: I5f7ca62a0eb1c0510c3e9b880d6ab8af8d736a25
> Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
> (cherry picked from commit b777d83001d8006420b6c7d2d88fe68950aa7e00)
Change-Id: I7a2b6fc2d36405d51990376333e093661be48475
Fixes: bz#1699714
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: glusterfs build is throwing error undefined
reference to `dlclose' on RHEL 6
Solution: Add LIB_DL link in Makefile.am to resolve the same
> Fixes: bz#1696512
> Change-Id: I58019ca9e29d569d8e6df282b8ab178ad540843b
> Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
> (cherry picked from commit 064aad721c249d63fb89686b32e5d15de50e2f8c)
Change-Id: I4f68553b501c283e2066ddc64e204db40552ee73
Fixes: bz#1699713
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
| |
This patch address post-merge review comments for commit
5784a00f997212d34bd52b2303e20c097240d91c
Change-Id: I7ed954664a2ae8e1091d23ee3ceb9c66e83bfeac
fixes: bz#1699319
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If an open comes on a file when a brick is down and after the brick comes up,
a fop comes on the fd, client xlator would still wind the fop on anon-fd
leading to wrong behavior of the fops in some cases.
Example:
If lk fop is issued on the fd just after the brick is up in the scenario above,
lk fop will be sent on anon-fd instead of failing it on that client xlator.
This lock will never be freed upon close of the fd as flush on anon-fd is
invalid and is not wound below server xlator.
As a fix, failing the fop unless the fd has FALLBACK_TO_ANON_FD flag.
Change-Id: I77692d056660b2858e323bdabdfe0a381807cccc
fixes bz#1699198
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
(cherry picked from commit 92ae26ae8039847e38c738ef98835a14be9d4296)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Fixes afr_ta_read_txn() to handle inode refresh failures.
code-path.
- Fixes a double free issue of dict.
Note: This patch address post-merge review comments for commit
69532c141be160b3fea03c1579ae4ac13018dcdf
fixes: bz#1693992
Change-Id: Id5299b45b68569d47df6b73755918237a1592cb4
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 500bd0014128e6727e83b6cb77e8ac94304b8f4a)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
1 - heal-wait-qlength is by default 128. If shd is disabled
and we need to heal files, client side heal is needed.
If we access these files that will trigger the heal.
However, it has been observed that a file will be enqueued
multiple times in the heal wait queue, which in turn causes
queue to be filled and prevent other files to be enqueued.
2 - While a file is going through healing and a write fop from
mount comes on that file, it sends write on all the bricks including
healing one. At the end it updates version and size on all the
bricks. However, it does not unset dirty flag on all the bricks,
even if this write fop was successful on all the bricks.
After healing completion this dirty flag remain set and never
gets cleaned up if SHD is disabled.
Solution:
1 - If an entry is already in queue or going through heal process,
don't enqueue next client side request to heal the same file.
2 - Unset dirty on all the bricks at the end if fop has succeeded on
all the bricks even if some of the bricks are going through heal.
Change-Id: Ia61ffe230c6502ce6cb934425d55e2f40dd1a727
updates: bz#1693223
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
(cherry picked from commit 313dcefe7a62bd16cd794040df068f9bec9c6927)
|
|
|
|
|
|
|
|
|
|
|
|
| |
Considering ctime is a client side feature, we can't blindly load ctime
xlator into the client graph if it's explicitly turned off, that'd
result into backward compatibility issue where an old client can't mount
a volume configured on a server which is having ctime feature.
Fixes: bz#1698471
Change-Id: I6ae7b96d056073aa6746de9a449cf319786d45cc
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
(cherry picked from commit efbf8abcc3bc729a90d4a7b57dc515f1df8a5863)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit ensures the following:
1. Don't send commit op request to the remote nodes when gluster v
status all is executed as for the status all transaction the local
commit gets the name of the volumes and remote commit ops are
technically a no-op. So no need for additional rpc requests.
2. In op state machine flow, if the transaction is in staged state and
op_info.skip_locking is true, then no need to set the txn id in the
priv->glusterd_txn_opinfo dictionary which never gets freed.
Fixes: bz#1694610
Change-Id: Ib6a9300ea29633f501abac2ba53fb72ff648c822
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
(cherry picked from commit 34e010d64905b7387de57840d3fb16a326853c9b)
|
|
|
|
|
|
|
|
|
|
| |
client-pid for glustershd is GF_CLIENT_PID_SELF_HEALD
client-pid for glfsheal is GF_CLIENT_PID_GLFS_HEALD
updates: bz#1693155
Change-Id: Ib3a863af160ff48c822a5e6b0c27c575c9887470
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 8016d51a3bbd410b0b927ed66be50a09574b7982)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we use heal info command, it takes lot of time as in
some cases it takes lock on entries to find out if the
entry actually needs heal or not.
There are some cases where we can avoid these locks and
can conclude if the entry needs heal or not.
1 - We do a lookup (without lock) on an entry, which we found in
.glusterfs/indices/xattrop, and find that lock count is
zero. Now if the file contains dirty bit set on all or any
brick, we can say that this entry needs heal.
2 - If the lock count is one and dirty is greater than 1,
then it also means that some fop had left the dirty bit set
which made the dirty count of current fop (which has taken lock)
more than one. At this point also we can definitely say that
this entry needs heal.
This patch is modifying code to take into consideration above two
points.
It is also changing code to not to call ec_heal_inspect if ec_heal_do
was called from client side heal. Client side heal triggeres heal
only when it is sure that it requires heal.
[We have changed the code to not to call heal for lookup]
updates bz#1697764
Change-Id: I7f09f0ecd12f65a353297aefd57026fd2bebdf9c
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
(cherry picked from commit da47caf2405c08c9abafc4a55525a8b2c2dd5bb8)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The fops allocate 3 kind of payload(buffer) in the client xlator:
- fop payload, this is the buffer allocated by the write and put fop
- rsphdr paylod, this is the buffer required by the reply cbk of
some fops like lookup, readdir.
- rsp_paylod, this is the buffer required by the reply cbk of fops like
readv etc.
Currently, in the lookup and readdir fop the rsphdr is sent as payload,
hence the allocated rsphdr buffer is also sent on the wire, increasing
the bandwidth consumption on the wire.
With this patch, the issue is fixed.
Fixes: bz#1692101
Change-Id: Ie8158921f4db319e60ad5f52d851fa5c9d4a269b
Signed-off-by: Poornima G <pgurusid@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1399758 Dereference before null check
It was introduced @ commit 67f48bfcc16a38052e6c9ae7c25e69b03b8ae008
updates: bz#1691187
> updates: bz#789278
> Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
> Change-Id: I1424b008b240691fe2a8924e31c708d0fb4f362d
> (cherry picked from commit 8aff9cc5c6277ef7dacfb89f1392b7c2eda9b825)
Change-Id: Ie2160fb9ae9cdeacf845e849da7f6001b3b6b10b
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Brick is getting crash because graph was not activated
at the time of accessing server_conf
Solution: To avoid the crash check ctx->active before processing
a request
> Change-Id: Ib112e0eace19189e45f430abdac5511c026bed47
> fixes: bz#1687705
>(cherry picked from commit 67f48bfcc16a38052e6c9ae7c25e69b03b8ae008)
> (Reviewed on upstream link https://review.gluster.org/#/c/glusterfs/+/22339/)
Change-Id: I1367c564f04edbad145575b811c67522cc318851
fixes: bz#1688218
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Discussion on thin arbiter volume -
https://github.com/gluster/glusterfs/issues/352#issuecomment-350981148
Main idea of having this rpm package is to deploy thin-arbiter
without glusterd and other commands on a node, and all we need
on that tie-breaker node is to run a single glusterfs command.
Also note that, no other glusterfs installation needs
thin-arbiter.so.
Make sure RPM contains sample vol file, which can work by default,
and a script to configure that volfile, along with translator image.
Change-Id: Ibace758373d8a991b6a19b2ecc60c93b2f8fc489
updates: bz#1672818
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
(cherry picked from commit ca9bef7f1538beb570fcb190ff94f86f0b8ba38a)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In an arbiter volume configuration SHD will not send any writes onto the arbiter
brick even if there is data pending marker for the arbiter brick. If we have a
arbiter setup on the geo-rep master and there are data pending markers for the files
on arbiter brick, SHD will not mark any data changelog during healing. While syncing
the data from master to slave, if the arbiter-brick is considered as ACTIVE, then
there is a chance that slave will miss out some data. If the arbiter brick is being
newly added or replaced there is a chance of slave missing all the data during sync.
Fix:
If there is data pending marker for the arbiter brick, send truncate on the arbiter
brick during heal, so that it will record truncate as the data transaction in changelog.
Change-Id: I3242ba6cea6da495c418ef860d9c3359c5459dec
fixes: bz#1687672
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: glusterd has memory leak while running "gluster v profile"
in a loop
Solution: Resolve leak code path to avoid leak
> Change-Id: Id608703ff6d0ad34ed8f921a5d25544e24cfadcd
> fixes: bz#1685414
> (Cherry pick from commit 9374484917466dff4688d96ff7faa0de1c804a6c)
> (Reviewed on link https://review.gluster.org/#/c/glusterfs/+/22301/)
Change-Id: I1ca118265f97b188f94b3d5cff649ec36cb18ca0
fixes: bz#1685771
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: commit 5a152a changed the mechansim of computing the
checksum. In heterogeneous cluster, peers are running into
rejected state because we have different cksum computation
mechansims in upgraded and non-upgraded nodes.
Solution: add a check for op-version so that all the nodes
in the cluster follow the same mechanism for computing the
cksum.
fixes: bz#1684029
> Change-Id: I1508f000e8c9895588b6011b8b6cc0eda7102193
> BUG: bz#1685120
> Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
> (cherry picked from commit 073444b693b7a91c42963512e0fdafb57ad46670)
Change-Id: I1508f000e8c9895588b6011b8b6cc0eda7102193
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This deadlock happens while processing dentry corresponding to current
directory (.) in rda_fill_readdirp. In this case following order is
followed:
LOCK(directory_fd_ctx->lock);
rda_inode_ctx_get_iatt -> LOCK(directory_inode->lock);
However, in rda_mark_inode_dirty following lock order is followed:
LOCK(directory_inode->lock);
LOCK(directory_fd_ctx->lock);
these two codepaths when executed concurrently resulted in a deadlock.
Current patch fixes this by removing locking directory inode and
fd-ctx in rda_fill_readdirp. This is fine as directory inode's stat
won't change due to writes to files within directory.
Change-Id: Ic93a67a0dac8229bb0d79582e526a512e6f2569c
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
Fixes: bz#1686399
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There was 30% regression observed in mkdir path with commit
b139bc58eb504adf5ef81658896c9283ae21f390. On analysis it is found
that io-threads xlator deprioritzes fops with all -ve pid.
Some context in to the no-root-squash pid requirement:
DHT xlator does some of the internal fops with root privileges. This is
needed so that operations like layout healing should not be abandoned
because a non root user is operating. If root-squash option is enabled
the layout set operation looses its root privilege as server xlator
converts the uid and pid to random numbers. Hence, the above mentioned
commit converted pid to GF_CLIENT_PID_NO_ROOT_SQUASH to continue fops
as root.
Combining the above I am proposing not to deprioritize fops with
no-root-squash pid.
> Change-Id: I54d056c01b25729304a77f9242fbaff39c5672ba
> fixes: bz#1676430
> Signed-off-by: Susant Palai <spalai@redhat.com>
(cherry picked from commit f5c3b1727f55ffaa3dcdb3c3a09b968ebb45dbb2)
Change-Id: I54d056c01b25729304a77f9242fbaff39c5672ba
fixes: bz#1676429
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
experimental xlators have been removed from the codebase. But we
missed to remove the options related to experimental xlators from
the codebase. This patch removes those options.
fixes: bz#1683506
Change-Id: I3fa7e14c6cd8ebde5cebc8d2b0cb2409bf37c1ae
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
(cherry picked from commit 5cddd4d758014fe116d9c130632eada2ecded88c)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The "struct iatt" in iatt.h is using int64_t types for storing
the atime, mtime and ctime. Therefore the struct 'struct md_cache' in
md-cache.c should also use this types to avoid an integer overflow.
This can happen e.g. if someone uses a very high default-retention-period
in the WORM-Xlator.
Change-Id: I605268d300ab622b9c8ab30e459dc00d9340aad1
fixes: bz#1680020
Signed-off-by: David Spisla <david.spisla@iternity.com>
(cherry picked from commit 15423e14f16dd1a15ee5e5cbbdbdd370e57ed59f)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If parallel-readdir is enabled, the rda xlator is loaded
below dht in the graph and proactively lists and caches
entries when an opendir is performed. dht_rmdir checks if
the directory being deleted contains stale linkto files by
performing a readdirp on its child subvols. However, as
the entries are actually read in during the opendir operation
which does not request the linkto xattr,no linkto xattrs are
present for the entries causing dht to incorrectly identify
them as data files and fail the rmdir operation with ENOTEMPTY.
DHT now always adds the linkto xattr in the list of xattrs
requested in the opendir.
Change-Id: I0711198e66c59146282eb8b88084170bedfb4018
fixes: bz#1679004
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The loc_wipe is done in the _out_ section, inode_unref(loc.parent) here
casues a double extra unref of loc.parent.
> Change-Id: I2dc809328d3d34bf7b02c7df9a4f97788af511e6
> Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
(cherry-pick of https://review.gluster.org/#/c/glusterfs/+/21998/)
Change-Id: I2dc809328d3d34bf7b02c7df9a4f97788af511e6
updates: bz#1679275
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Two issues were found:
1. in wb_readdirp_cbk, inode should unrefed after wb_inode is
unlocked. Otherwise, inode and hence the context wb_inode can be freed
by the type we try to unlock wb_inode
2. wb_readdirp_mark_end iterates over a list of wb_inodes of children
of a directory. But inodes could've been freed and hence the list
might be corrupted. To fix take a reference on inode before adding it
to invalidate_list of parent.
Change-Id: I911b0e0b2060f7f41ded0b05db11af6f9b7c09c5
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
Updates: bz#1678570
(cherry picked from commit 64cc4458918e8c8bfdeb114da0a6501b2b98491a)
|
|
|
|
|
|
|
| |
Change-Id: I7be9a5f48dcad1b136c479c58b1dca1e0488166d
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
Fixes: bz#1678570
(cherry picked from commit 6175cb10cd5f59f3c7ae4100bc78f359b68ca3e9)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A race between the lookup selfheal and rmdir can cause
directories to be healed only on non-hashed subvols.
This can prevent the directory from being listed from
the mount point and in turn causes rm -rf to fail with
ENOTEMPTY.
Fix: Update the layout information correctly and reduce
the call count only after processing the response.
Change-Id: I812779aaf3d7bcf24aab1cb158cb6ed50d212451
fixes: bz#1677260
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Explicit invalidation by calling inode_invalidate is necessary when
same (meta)data is shared/access across multiple mounts. Without an
explicit inode_invalidate call, caches in the mount which didn't
witness writes wouldn't be aware of changes as writes wouldn't have
passed through them. However, if (meta)data is not shared, all
relevant I/O goes through the cache of single mount and hence is
coherent with (meta)data on bricks always. So, explicit inode
invalidation can be disabled for this case which gives a huge
performance boost for workloads that write data and then immediately
read the data they just wrote. Note that otherwise, local writes
(which pass through the cache) will change ctime and cause unnecessary
invalidations.
The name of the option that controls this behavior is
"performance.global-cache-invalidation". This option is global and it
purges caches both in glusterfs and kernel stack for native FUSE
mounts. For non-native FUSE mounts, it purges cache only from
glusterfs stack. This option is effective only when
performance.stat-prefetch is on.
Note that there is a similar option "performance.cache-invalidation",
but the scope of that option is limited to quick-read and md-cache.
Change-Id: I462bb4b65ff9aae1f6ba76f50b1f2f94fb10323b
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
updates: bz#1674364
(cherry picked from commit 2b5aa4489de2017a03bcb6ec8986286f0c76a670)
|
|
|
|
|
|
|
|
|
|
|
|
| |
When "auto-invalidation" option was not specified for mount script,
glusterfs cmdline ended with "--auto-invalidation=" option. This patch
fixes that bug in mount script.
Thanks to Amar for reporting it.
Change-Id: Ie5cd4c6ffb3ac644d9d2b032035f914a935d05a8
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
updates: bz#1674364
|
|
|
|
|
|
|
|
|
|
| |
glusterd_resolve_all_bricks failure log should highlight the brick
identifier.
Change-Id: I035b4650ef6a14bb1e1221d3bad1c40f9d43dbdd
fixes: bz#1673972
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
(cherry picked from commit 12af2067a82e37079e76723d3e25ba1c72ca078a)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: get-state command will error out, if any of the underlying
brick(s) of volume(s) in the cluster go bad.
It is expected that get-state command should not error out, but
should generate an output successfully.
Solution: In glusterd_get_state(), a statfs call is made on the
brick path for every bricks of the volumes to calculate the total
and free memory available. If any of statfs call fails on any
brick, we should not error out and should report total memory and free
memory of that brick as 0.
This patch also handles a statfs failure scenario in
glusterd_store_retrieve_bricks().
fixes: bz#1672205
Change-Id: Ia9e8a1d8843b65949d72fd6809bd21d39b31ad83
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Scenarios tested:
* Upgrade the node when there are stripe / tiering and regular
type of volumes are present.
- All volumes are started fine (as the change was not on brick volfile)
- For tier, the functionality may not even work, as changetimerecorder
is not present.
- 'gluster volume info' properly shows as 'NOT SUPPORTED' for stripe and
tier type of volume.
* Upgrade in a rolling upgrade scenario, where an old version is
able to connect to higher master.
- on a normal volume, if the volfile-server was new, the newer client
volfiles needed to have utime xlator conditionally.
- with this one change, all other changes seem to work fine.
Change-Id: Ib2d3b69dafa02b2c695a735b13c1aa70aba07cb8
updates: bz#1635688
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fuse sets a random gfid-req value for a fresh lookup. Posix
lookup will set this gfid on entries with missing gfids causing
a GFID mismatch for directories.
DHT will now ignore the Fuse provided gfid-req and use the GFID
returned from other subvols to heal the missing gfid.
Change-Id: I5f541978808f246ba4542564251e341ec490db14
fixes: bz#1670259
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Auto invalidation is necessary when same (meta)data is shared/access
across multiple mounts. However, if (meta)data is not shared, all
relevant I/O goes through the cache of single mount and hence is
coherent with (meta)data on bricks always. So, fuse-auto-invalidation
can be disabled for this case which gives a huge performance boost for
workloads that write data and then immediately read the data they just
wrote.
From glusterfs --help,
<snip>
--auto-invalidation[=BOOL] controls whether fuse-kernel can
auto-invalidate attribute, dentry and page-cache.
Disable this only if same files/directories are
not accessed across two different mounts
concurrently [default: "on"]
</snip>
Details on how disabling auto-invalidation helped to reduce pgbench
init times can be found at [1]. Time taken for pgbench init of scale
8000 was 8340s. That will be an improvement of 86% (59280s vs 8340s)
with auto-invalidations turned off along with other
optimizations. Just disabling auto-invalidation contributed 56%
improvement by reducing the total time taken by 33260s.
[1] https://www.spinics.net/lists/gluster-devel/msg25907.html
Change-Id: I0ed730dba9064bd9c576ad1800170a21e100e1ce
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
updates: bz#1664934
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch creates a specific function to set the thread name using a
string format and a variable argument list, like printf().
This function is used to set the thread name from gf_thread_create(),
which now accepts a variable argument list to create the full name. It's
not necessary anymore to use a local array to build the name of the
thread. This is done automatically.
Change-Id: Idd8d01fd462c227359b96e98699f8c6d962dc17c
Updates: bz#1193929
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a fop to create an entry fails on one of the data brick,
we mark the pending changelog on the entry on brick for which
it was successful. This is done as part of post op phase to
make sure that entry gets healed even if it gets renamed to
some other path where its parent was not marked as bad.
As it happens as part of post op, we should consider thin-arbiter
to check if the brick, which was successful, is the good brick or not.
This will avoide split brain and other issues.
Change-Id: I12686675be98f02f70a5186b3ed748c541514d53
updates: bz#1662264
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rebalance sets the sgid and t bits on a file
that is being migrated. These permissions are
not removed in dht_readdirp_cbk when listing files
causing them to show up on the mountpoint.
We now remove these permissions if a non-linkto
file has the linkto xattr set.
Change-Id: I5c69b2ecfe2df804fe50faea903b242d01729596
fixes: bz#1669937
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Avoid thread creation for bitrot-stub
for a volume if feature is not enabled
Solution: Before thread creation check the flag if feature
is enabled
Updates: #475
Change-Id: I2c6cc35bba142d4b418cc986ada588e558512c8e
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: When rpc-transport-disconnect happens, server_connection_cleanup_flush_cbk()
is supposed to call rpc_transport_unref() after open-files on
that transport are flushed per transport.But open-fd-count is
maintained in bound_xl->fd_count, which can be incremented/decremented
cumulatively in server_connection_cleanup() by all transport
disconnect paths. So instead of rpc_transport_unref() happening
per transport, it ends up doing it only once after all the files
on all the transports for the brick are flushed leading to
rpc-leaks.
Solution: To avoid races maintain fd_cnt at client instead of maintaining
on brick
Credits: Pranith Kumar Karampuri
Change-Id: I6e8ea37a61f82d9aefb227c5b3ab57a7a36850e6
fixes: bz#1668190
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
...when ctime is zero. ia_type and ia_gfid always need to be non-zero
for things to work correctly.
Problem:
Commit c9bde3021202f1d5c5a2d19ac05a510fc1f788ac zeroed out the iatt
buffer in the cbks of modification fops before unwinding if the ctime in
the buffer was zero. This was causing the fops to fail: noticeable when
AFR's 'consistent-metadata' option was enabled. (AFR zeros out the ctime
when the option is set. See commit
4c4624c9bad2edf27128cb122c64f15d7d63bbc8).
Fixes:
-Do not zero out the ia_type and ia_gfid of the iatt buff under any
circumstance.
-Also, fixed _rda_inode_ctx_update_iatts() to always update these values from
the incoming buf when ctime is zero. Otherwise we end up with zero
ia_type and ia_gfid the first time the function is called *and* the
incoming buf has ctime set to zero.
fixes: bz#1670253
Reported-By:Michael Hanselmann <public@hansmi.ch>
Change-Id: Ib72228892d42c3513c19fc6dfb543f2aa3489eca
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the feature enabled, some of the performance testing results,
specially those which create millions of small files, got approximately
4x regression compared to version before enabling this.
On master without this patch: 765 creates/sec
On master with this patch : 3380 creates/sec
Also there seems to be regression caused by this in 'ls -l' workload.
On master without this patch: 3030 files/sec
On master with this patch : 16610 files/sec
This is a feature added to handle multiple clients parallely operating
(specially those which race for file creates with same name) on a single
namespace/directory. Considering that is < 3% of Gluster's usecase right
now, it makes sense to disable the feature by default, so we don't
penalize the default users who doesn't bother about this usecase.
Also note that the client side translators, specially, distribute,
replicate and disperse already handle the issue upto 99.5% of the cases
without SDFS, so it makes sense to keep the feature disabled by default.
Credits: Shyamsunder <srangana@redhat.com> for running the tests and
getting the numbers.
Change-Id: Iec49ce1d82e621e9db25eb633fcb1d932e74f4fc
Updates: bz#1670031
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Mostly, unlock before logging.
In some cases, moved different code that was not needed
to be under lock (for example, taking time, or malloc'ing)
to be executed before taking the lock.
Note: logging might be slightly less accurate in order, since it may
not be done now under the lock, so order of logs is racy. I think
it's a reasonable compromise.
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I2438710016afc9f4f62a176ef1a0d3ed793b4f89
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PROBLEM:
Lot of the earlier changes in the management of shards in lru, fsync
lists assumed that if a given shard exists in fsync list, it must be
part of lru list as well. This was found to be not true.
Consider this - a file is FALLOCATE'd to a size which would make the
number of participant shards to be greater than the lru list size.
In this case, some of the resolved shards that are to participate in
this fop will be evicted from lru list to give way to the rest of the
shards. And once FALLOCATE completes, these shards are added to fsync
list but without a ref. After the fop completes, these shard inodes
are unref'd and destroyed while their inode ctxs are still part of
fsync list. Now when an FSYNC is called on the base file and the
fsync-list traversed, the client crashes due to illegal memory access.
FIX:
Hold a ref on the shard inode when adding to fsync list as well.
And unref under following conditions:
1. when the shard is evicted from lru list
2. when the base file is fsync'd
3. when the shards are deleted.
Change-Id: Iab460667d091b8388322f59b6cb27ce69299b1b2
fixes: bz#1669077
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
gf_dirent struct has d_type variable which should check
with DT_DIR istead of IA_IFDIR or IA_IFDIR has to compare
with entry->d_stat.ia_type
Change-Id: Idf1059ce2a590734bc5b6adaad73604d9a708804
updates: bz#1653359
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: At the time of deleting block hosting volume
through heketi-cli , it is throwing an error "target is busy".
cli is throwing an error because brick is not detached successfully
and brick is not detached due to race condition to cleanp xprt
associated with detached brick
Solution: To avoid xprt specifc race condition introduce an atomic flag
on rpc_transport
Change-Id: Id4ff1fe8375a63be71fb3343f455190a1b8bb6d4
fixes: bz#1668190
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes a lock contention in reaadir-ahead xlator.
There are two issues, one is the processing of "." ".."
entry while holding an fd_ctx lock. The other one is destroying
the stack inside a fd_ctx lock.
Change-Id: Id0bf83a3d9fea6b40015b8d167525c59c6cfa25e
updates: bz#1659708
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|