summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* doc: Added release notes for 5.13v5.13release-5Hari Gowtham2020-04-071-0/+35
| | | | | | Change-Id: Id168b675f8afae2ba2e5c8ac5b431aa14c832b54 fixes: #1147 Signed-off-by: Hari Gowtham <hgowtham@redhat.com>
* open-behind: fix missing fd referenceXavi Hernandez2020-04-061-11/+16
| | | | | | | | | | | Open behind was not keeping any reference on fd's pending to be opened. This makes it possible that a concurrent close and en entry fop (unlink, rename, ...) caused destruction of the fd while it was still being used. Change-Id: Ie9e992902cf2cd7be4af1f8b4e57af9bd6afd8e9 Fixes: #1028 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* features/shard: Fix crash during shards cleanup in error casesKrutika Dhananjay2020-04-061-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | A crash is seen during a reattempt to clean up shards in background upon remount. And this happens even on remount (which means a remount is no workaround for the crash). In such a situation, the in-memory base inode object will not be existent (new process, non-existent base shard). So local->resolver_base_inode will be NULL. In the event of an error (in this case, of space running out), the process would crash at the time of logging the error in the following line - gf_msg(this->name, GF_LOG_ERROR, local->op_errno, SHARD_MSG_FOP_FAILED, "failed to delete shards of %s", uuid_utoa(local->resolver_base_inode->gfid)); Fixed that by using local->base_gfid as the source of gfid when local->resolver_base_inode is NULL. Change-Id: I0b49f2b58becd0d8874b3d4b14ff8d92a89d02d5 Fixes: #1127 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> (cherry picked from commit cc43ac8651de9aa508b01cb259b43c02d89b2afc)
* cluster/afr: fix race when bricks come upXavi Hernandez2020-04-063-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | The was a problem when self-heal was sending lookups at the same time that one of the bricks was coming up. In this case there was a chance that the number of 'up' bricks changes in the middle of sending the requests to subvolumes which caused a discrepancy in the expected number of replies and the actual number of sent requests. This discrepancy caused that AFR continued executing requests before all requests were complete. Eventually, the frame of the pending request was destroyed when the operation terminated, causing a use- after-free issue when the answer was finally received. In theory the same thing could happen in the reverse way, i.e. AFR tries to wait for more replies than sent requests, causing a hang. Backport of: > Change-Id: I7ed6108554ca379d532efb1a29b2de8085410b70 > Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> > Fixes: bz#1808875 Change-Id: I7ed6108554ca379d532efb1a29b2de8085410b70 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> Fixes: bz#1809440
* afr: prevent spurious entry heals leading to gfid split-brainRavishankar N2020-04-067-29/+69
| | | | | | | | | | | | | | | | | | | | | Problem: In a hyperconverged setup with granular-entry-heal enabled, if a file is recreated while one of the bricks is down, and an index heal is triggered (with the brick still down), entry-self heal was doing a spurious heal with just the 2 good bricks. It was doing a post-op leading to removal of the filename from .glusterfs/indices/entry-changes as well as erroneous setting of afr xattrs on the parent. When the brick came up, the xattrs were cleared, resulting in the renamed file not getting healed and leading to gfid split-brain and EIO on the mount. Fix: Proceed with entry heal only when shd can connect to all bricks of the replica, just like in data and metadata heal. fixes: #1103 Change-Id: I916ae26ad1fabf259bc6362da52d433b7223b17e Signed-off-by: Ravishankar N <ravishankar@redhat.com> (cherry picked from commit 06453d77d056fbaa393a137ca277a20e38d2f67e)
* afr: mark pending xattrs as a part of metadata healRavishankar N2020-04-022-1/+120
| | | | | | | | | | | | | | | | | | | | ...if pending xattrs are zero for all children. Problem: If there are no pending xattrs and a metadata heal needs to be performed, it can be possible that we end up with xattrs inadvertendly deleted from all bricks, as explained in the BZ. Fix: After picking one among the sources as the good copy, mark pending xattrs on all sources to blame the sinks. Now even if this metadata heal fails midway, a subsequent heal will still choose one of the valid sources that it picked previously. Updates: #1067 Change-Id: If1b050b70b0ad911e162c04db4d89b263e2b8d7b Signed-off-by: Ravishankar N <ravishankar@redhat.com> (cherry picked from commit 2d5ba449e9200b16184b1e7fc84cabd015f1f779)
* Cluster/afr: Don't treat all bricks having metadata pending as split-brainkarthik-us2020-03-024-67/+133
| | | | | | | | | | | | | | | | | | | | | | | | | | Problem: We currently don't have a roll-back/undoing of post-ops if quorum is not met. Though the FOP is still unwound with failure, the xattrs remain on the disk. Due to these partial post-ops and partial heals (healing only when 2 bricks are up), we can end up in metadata split-brain purely from the afr xattrs point of view i.e each brick is blamed by atleast one of the others for metadata. These scenarios are hit when there is frequent connect/disconnect of the client/shd to the bricks. Fix: Pick a source based on the xattr values. If 2 bricks blame one, the blamed one must be treated as sink. If there is no majority, all are sources. Once we pick a source, self-heal will then do the heal instead of erroring out due to split-brain. This patch also adds restriction of all the bricks to be up to perform metadata heal to avoid any metadata loss. Removed the test case tests/bugs/replicate/bug-1468279-source-not-blaming-sinks.t as it was doing metadata heal even when only 2 of 3 bricks were up. Change-Id: I07a9d62f84ceda329dcab1f02a33aeed258dcb09 fixes: bz#1806931 Signed-off-by: karthik-us <ksubrahm@redhat.com>
* rpc: Update address family if it is not provide in cmd-line argumentsMohit Agrawal2020-03-021-1/+12
| | | | | | | | | | | | | | | | | | | | Problem: After enabling transport-type to inet6 and passed ipv6 transport.socket.bind-address in glusterd.vol clients are not started. Solution: Need to update address-family based on remote-address for all gluster client process > Change-Id: Iaa3588cd87cebc45231bfd675745c1a457dc9b31 > Fixes: bz#1747746 > Credits: Amgad Saleh <amgad.saleh@nokia.com> > Signed-off-by: Mohit Agrawal <moagrawal@redhat.com> > (cherry picked from commit 80b8cfe3f1386606bada97a76a0cad7acdf6b877) Change-Id: Iaa3588cd87cebc45231bfd675745c1a457dc9b31 Fixes: bz#1807007 Credits: Amgad Saleh <amgad.saleh@nokia.com> Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* protocol/client: Do not fallback to anon-fd if fd is not openPranith Kumar K2020-03-022-1/+43
| | | | | | | | | | | | | | | | | | | | | | If an open comes on a file when a brick is down and after the brick comes up, a fop comes on the fd, client xlator would still wind the fop on anon-fd leading to wrong behavior of the fops in some cases. Example: If lk fop is issued on the fd just after the brick is up in the scenario above, lk fop will be sent on anon-fd instead of failing it on that client xlator. This lock will never be freed upon close of the fd as flush on anon-fd is invalid and is not wound below server xlator. As a fix, failing the fop unless the fd has FALLBACK_TO_ANON_FD flag. > Change-Id: I77692d056660b2858e323bdabdfe0a381807cccc > fixes bz#1390914 > Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Change-Id: I77692d056660b2858e323bdabdfe0a381807cccc fixes bz#1808256 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* afr: wake up index healer threadsRavishankar N2020-02-276-11/+67
| | | | | | | | | | | | | | | Backport of https://review.gluster.org/#/c/glusterfs/+/23288/ ...whenever shd is re-enabled after disabling or there is a change in `cluster.heal-timeout`, without needing to restart shd or waiting for the current `cluster.heal-timeout` seconds to expire. See BZ 1743988 for more details. Change-Id: Ia5ebd7c8e9f5b54cba3199c141fdd1af2f9b9bfe fixes: bz#1807431 Reported-by: Glen Kiessling <glenk1973@hotmail.com> Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* tests: fix spurious failure of bug-1402841.t-mt-dir-scan-race.tRavishankar N2020-02-271-4/+5
| | | | | | | | | | | | | | | | | | | | | | | Problem: Since commit 600ba94183333c4af9b4a09616690994fd528478, shd starts healing as soon as it is toggled from disabled to enabled. This was causing the following line in the .t to fail on a 'fast' machine (always on my laptop and sometimes on the jenkins slaves). EXPECT_NOT "^0$" get_pending_heal_count $V0 because by the time shd was disabled, the heal was already completed. Fix: Increase the no. of files to be healed and make it a variable called FILE_COUNT, should we need to bump it up further because the machines become even faster. Also created pending metadata heals to increase the time taken to heal a file. fixes: bz#1807748 Change-Id: I5a26b08e45b8c19bce3c01ce67bdcc28ed48198d Signed-off-by: Ravishankar N <ravishankar@redhat.com> (cherry picked from commit 724c657995a2e148243eeb78c68b620c6d7714a5)
* doc: Added release 5.12 notesv5.12hari gowtham2020-02-261-0/+34
| | | | | | fixes: bz#1806848 Change-Id: I3de9fa0887b1d68dfd32595e290dabfcc65c5311 Signed-off-by: hari gowtham <hgowtham@redhat.com>
* cluster/ec: Change handling of heal failure to avoid crashAshish Pandey2020-02-252-13/+13
| | | | | | | | | | | | | | | | | Problem: ec_getxattr_heal_cbk was called with NULL as second argument in case heal was failing. This function was dereferencing "cookie" argument which caused crash. Solution: Cookie is changed to carry the value that was supposed to be stored in fop->data, so even in the case when fop is NULL in error case, there won't be any NULL dereference. Thanks to Xavi for the suggestion about the fix. Change-Id: I0798000d5cadb17c3c2fbfa1baf77033ffc2bb8c fixes: bz#1805057
* cluster/ec: Update lock->good_mask on parent fop failurePranith Kumar K2020-02-252-0/+4
| | | | | | | | | | When discard/truncate performs write fop, it should do so after updating lock->good_mask to make sure readv happens on the correct mask fixes: bz#1805056 Change-Id: Idfef0bbcca8860d53707094722e6ba3f81c583b7 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* cluster/ec: Fix reopen flags to avoid misbehaviorPranith Kumar K2020-02-252-3/+8
| | | | | | | | | | | | | | | | | | | | | | | Problem: when a file needs to be re-opened O_APPEND and O_EXCL flags are not filtered in EC. - O_APPEND should be filtered because EC doesn't send O_APPEND below EC for open to make sure writes happen on the individual fragments instead of at the end of the file. - O_EXCL should be filtered because shd could have created the file so even when file exists open should succeed - O_CREAT should be filtered because open happens with gfid as parameter. So open fop will create just the gfid which will lead to problems. Fix: Filter out these two flags in reopen. Change-Id: Ia280470fcb5188a09caa07bf665a2a94bce23bc4 Fixes: bz#1805055 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* cluster/ec: Always read from good-maskPranith Kumar K2020-02-252-5/+25
| | | | | | | | | | There are cases where fop->mask may have fop->healing added and readv shouldn't be wound on fop->healing. To avoid this always wind readv to lock->good_mask fixes: bz#1805054 Change-Id: I2226ef0229daf5ff315d51e868b980ee48060b87 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* cluster/ec: fix EIO error for concurrent writes on sparse filesXavi Hernandez2020-02-252-10/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | EC doesn't allow concurrent writes on overlapping areas, they are serialized. However non-overlapping writes are serviced in parallel. When a write is not aligned, EC first needs to read the entire chunk from disk, apply the modified fragment and write it again. The problem appears on sparse files because a write to an offset implicitly creates data on offsets below it (so, in some way, they are overlapping). For example, if a file is empty and we read 10 bytes from offset 10, read() will return 0 bytes. Now, if we write one byte at offset 1M and retry the same read, the system call will return 10 bytes (all containing 0's). So if we have two writes, the first one at offset 10 and the second one at offset 1M, EC will send both in parallel because they do not overlap. However, the first one will try to read missing data from the first chunk (i.e. offsets 0 to 9) to recombine the entire chunk and do the final write. This read will happen in parallel with the write to 1M. What could happen is that half of the bricks process the write before the read, and the half do the read before the write. Some bricks will return 10 bytes of data while the otherw will return 0 bytes (because the file on the brick has not been expanded yet). When EC tries to recombine the answers from the bricks, it can't, because it needs more than half consistent answers to recover the data. So this read fails with EIO error. This error is propagated to the parent write, which is aborted and EIO is returned to the application. The issue happened because EC assumed that a write to a given offset implies that offsets below it exist. This fix prevents the read of the chunk from bricks if the current size of the file is smaller than the read chunk offset. This size is correctly tracked, so this fixes the issue. Also modifying ec-stripe.t file for Test #13 within it. In this patch, if a file size is less than the offset we are writing, we fill zeros in head and tail and do not consider it strip cache miss. That actually make sense as we know what data that part holds and there is no need of reading it from bricks. Change-Id: Ic342e8c35c555b8534109e9314c9a0710b6225d6 Fixes: bz#1805053 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* cluster/ec: skip updating ctx->loc again when ec_fix_open/opendirKinglong Mee2020-02-252-10/+14
| | | | | | | | | | | | | The ec_manager_open/opendir memsets ctx->loc which causes memory/inode leak, and ec_fheal uses ctx->loc out of fd->lock that loc_copy may copy bad data when memset it. This patch skips updating ctx->loc when it is initilizaed. With it, ctx->loc is filled once, and never updated. Change-Id: I3bf5ffce4caf4c1c667f7acaa14b451d37a3550a fixes: bz#1805052 Signed-off-by: Kinglong Mee <mijinlong@horiscale.com>
* cluster/ec: inherit healing from lock when it has infoKinglong Mee2020-02-251-2/+3
| | | | | | | | | If lock has info, fop should inherit healing mask from it. Otherwise, fop cannot inherit right healing when changed_flags is zero. Change-Id: Ife80c9169d2c555024347a20300b0583f7e8a87f fixes: bz#1805051 Signed-off-by: Kinglong Mee <mijinlong@horiscale.com>
* cluster/dht: Correct fd processing loopN Balachandran2020-02-251-22/+62
| | | | | | | | | | | | | | backport of https://review.gluster.org/#/c/glusterfs/+/23506/ The fd processing loops in the dht_migration_complete_check_task and the dht_rebalance_inprogress_task functions were unsafe and could cause an open to be sent on an already freed fd. This has been fixed. Change-Id: I0a3c7d2fba314089e03dfd704f9dceb134749540 Fixes: bz#1804522 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* cluster/ec: Prevent double pre-op xattropsPranith Kumar K2020-02-242-6/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Race: Thread-1 Thread-2 1) Does ec_get_size_version() to perform pre-op fxattrop as part of write-1 2) Calls ec_set_dirty_flag() in ec_get_size_version() for write-2. This sets dirty[] to 1 3) Completes executing ec_prepare_update_cbk leading to ctx->dirty[] = '1' 4) Takes LOCK(inode->lock) to check if there are any flags and sets dirty-flag because lock->waiting_flag is 0 now. This leads to fxattrop to increment on-disk dirty[] to '2' At the end of the writes the file will be marked for heal even when it doesn't need heal. Fix: Perform ec_set_dirty_flag() and other checks inside LOCK() to prevent dirty[] to be marked as '1' in step 2) above Fixes: bz#1805050 Change-Id: Icac2ab39c0b1e7e154387800fbededc561612865 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* cluster/ec: Reopen shouldn't happen with O_TRUNCPranith Kumar K2020-02-241-1/+1
| | | | | | | | | | | | | Problem: Doing re-open with O_TRUNC will truncate the fragment even when it is not needed needing extra heals Fix: At the time of re-open don't use O_TRUNC. fixes: bz#1805049 Change-Id: Idc6408968efaad897b95a5a52481c66e843d3fb8 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* server: Mount fails after reboot 1/3 gluster nodesMohit Agrawal2020-02-243-17/+29
| | | | | | | | | | | | | | | | | | | | | | Problem: At the time of coming up one server node(1x3) after reboot client is unmounted.The client is unmounted because a client is getting AUTH_FAILED event and client call fini for the graph.The client is getting AUTH_FAILED because brick is not attached with a graph at that moment Solution: To avoid the unmounting the client graph throw ENOENT error from server in case if brick is not attached with server at the time of authenticate clients. > Credits: Xavi Hernandez <xhernandez@redhat.com> > Change-Id: Ie6fbd73cbcf23a35d8db8841b3b6036e87682f5e > Fixes: bz#1793852 > Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> > (cherry picked from commit f6421dff22a6ddaf14134f6894deae219948c89d) Fixes: bz#1804512 Change-Id: Ie6fbd73cbcf23a35d8db8841b3b6036e87682f5e Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* cluster/ec: fix fd reopenXavi Hernandez2020-02-2016-274/+431
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently EC tries to reopen fd's that have been opened while a brick was down. This is done as part of regular write operations, just after having acquired the locks, and it's sent as a sub-fop of the main write fop. There were two problems: 1. The reopen was attempted on all UP bricks, even if a previous lock didn't succeed. This is incorrect because most probably the open will fail. 2. If reopen is sent and fails, the error is propagated to the main operation, causing it to fail when it shouldn't. To fix this, we only attempt reopens on bricks where the current fop owns a lock, and we prevent any error to be propagated to the main fop. To implement this behaviour an argument used to indicate the minimum number of required answers has overloaded to also include some flags. To make the change consistent, it has been necessary to rename the argument, which means that a lot of files have been changed. However there are no functional changes. This change has also uncovered a problem in discard code, which didn't correctely process requests of small sizes because no real discard fop was being processed, only a write of 0's on some region. In this case some fields of the fop remained uninitialized or with incorrect values. To fix this, a new function has been created to simulate success on a fop and it's used in the discard case. Thanks to Pranith for providing a test script that has also detected an issue in this patch. This patch includes a small modification of this script to force data to be written into bricks before stopping them. Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec Fixes: bz#1805047 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* extras: enable log rotation for USS logsSunny Kumar2020-02-171-0/+21
| | | | | | | | | | | | | | Added logrotate support for user serviceable snapshot's logs. Backport of: >Upstream patch: https://review.gluster.org/#/c/glusterfs/+/23930/ >Change-Id: Ic920eaa8ab5e44daf5937a027c6913d7bb26d517 >Fixes: bz#1786722 >Signed-off-by: Sunny Kumar <sunkumar@redhat.com> >(cherry picked from commit f2a86ddf9331cd7387c44ba03307adac43575fc4) Change-Id: Ic920eaa8ab5e44daf5937a027c6913d7bb26d517 Fixes: bz#1803810 Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* doc: Added release 5.11 notesv5.11Hari Gowtham2019-12-091-0/+23
| | | | | | | Fixes: bz#1762175 Change-Id: Ifaf8eb242d98b96ce83e5669b5fb08a18b7dee6a Signed-off-by: Hari Gowtham <hgowtham@redhat.com>
* inode: fix wrong loop count in __inode_ctx_freeXie Changlong2019-12-061-5/+6
| | | | | | | | | Avoid serious memory leak fixes: bz#1779284 Change-Id: Ic61a8fdd0e941e136c98376a87b5a77fa8c22316 Signed-off-by: Xie Changlong <xiechanglong@cmss.chinamobile.com> (cherry picked from commit 93156e203c2f51d8a4047889a4c769564b8107b1)
* system/posix-acl: update ctx only if iatt is non-NULLHomma2019-12-061-0/+8
| | | | | | | | | | | | | | | We need to safe-guard against possible zero'ing out of iatt structure in acl ctx, which can cause many issues. Backport of: > fixes: bz#1668286 > Change-Id: Ie81a57d7453a6624078de3be8c0845bf4d432773 > Signed-off-by: Amar Tumballi <amarts@redhat.com> (cherry-picked from commit 6bf9637a93011298d032332ca93009ba4e377e46) fixes: bz#1767305 Change-Id: Ie81a57d7453a6624078de3be8c0845bf4d432773 Signed-off-by: Alan Orth <alan.orth@gmail.com>
* Detach iot_worker to release its resourcesLiguang Li2019-11-051-0/+1
| | | | | | | | | | | | | When iot_worker terminates, its resources have not been reaped, which will consumes lots of memory. Detach iot_worker to automically release its resources back to the system. Change-Id: I71fabb2940e76ad54dc56b4c41aeeead2644b8bb fixes: bz#1718734 Signed-off-by: Liguang Li <liguang.lee6@gmail.com> Signed-off-by: N Balachandran <nbalacha@redhat.com>
* cluster/afr: Heal entries when there is a source & no healed_sinkskarthik-us2019-10-112-0/+104
| | | | | | | | | | | | | | | | | | | | Problem: In a situation where B1 blames B2, B2 blames B1 and B3 doesn't blame anything for entry heal, heal will not complete even though we have clear source and sinks. This will happen because while doing afr_selfheal_find_direction() only the bricks which are blamed by non-accused bricks are considered as sinks. Later in __afr_selfheal_entry_finalize_source() when it tries to mark all the non-sources as sinks it fails to do so because there won't be any healed_sinks marked, no witness present and there will be a source. Fix: If there is a source and no healed_sinks, then reset all the locked sources to 0 and healed sinks to 1 to do conservative merge. Change-Id: If40d8bc95d52a52b2730f55bdcf135109b421548 Fixes: bz#1760710 Signed-off-by: karthik-us <ksubrahm@redhat.com>
* doc: Added release 5.10 notesv5.10Hari Gowtham2019-10-091-0/+23
| | | | | | | Fixes: bz#1759832 Change-Id: I9c38181ec50d9b18ebc8bb8e1290f20a006be3bc Signed-off-by: Hari Gowtham <hgowtham@redhat.com>
* geo-rep : fix mountbroker setupSunny Kumar2019-09-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Unable to setup mountbroker root directory while creating geo-replication session for non-root user. Casue: With patch[1] which defines the max-port for glusterd one extra sapce got added in field of 'option max-port'. [1]. https://review.gluster.org/#/c/glusterfs/+/21872/ In geo-rep spliting of key-value pair form vol file was done on the basis of space so this additional space caused "ValueError: too many values to unpack". Solution: Use split so that it can treat consecutive whitespace as a single separator. Backport of: https://review.gluster.org/#/c/glusterfs/+/22716/. >Fixes: bz#1709248 >Change-Id: Ia22070a43f95d66d84cb35487f23f9ee58b68c73 >Signed-off-by: Sunny Kumar <sunkumar@redhat.com> >(cherry picked from commit 3dd03146bb7037ae2ebea0579d0b81be27fdd927) Change-Id: Ia22070a43f95d66d84cb35487f23f9ee58b68c73 Fixes: bz#1750230 Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* afr/lookup: Pass xattr_req in while doing a selfheal in lookupMohammed Rafi KC2019-09-055-5/+70
| | | | | | | | | | | | | | | We were not passing xattr_req when doing a name self heal as well as a meta data heal. Because of this, some xdata was missing which causes i/o errors Backport of>https://review.gluster.org/#/c/glusterfs/+/23024/ >Change-Id: Ibfb1205a7eb0195632dc3820116ffbbb8043545f >Fixes: bz#1728770 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Change-Id: I740ccdbfcf2667fe4a850c12ecdc4b9eeed08293 Fixes: bz#1749352 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* protocol/client: propagte GF_EVENT_CHILD_PING only for connections to brickRaghavendra G2019-08-091-4/+12
| | | | | | | | | | | | | | | | | | | | | | | | Two reasons: * ping responses from glusterd may not be relevant for Halo replication. Instead, it might be interested in only knowing whether the brick itself is responsive. * When a brick is killed, propagating GF_EVENT_CHILD_PING of ping response from glusterd results in GF_EVENT_DISCONNECT spuriously propagated to parent xlators. These DISCONNECT events are from the connections client establishes with glusterd as part of its reconnect logic. Without GF_EVENT_CHILD_PING, the last event propagated to parent xlators would be the first DISCONNECT event from brick and hence subsequent DISCONNECTS to glusterd are not propagated as protocol/client prevents same event being propagated to parent xlators consecutively. propagating GF_EVENT_CHILD_PING for ping responses from glusterd would change the last_sent_event to GF_EVENT_CHILD_PING and hence protocol/client cannot prevent subsequent DISCONNECT events Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Fixes: bz#1739336 Change-Id: I50276680c52f05ca9e12149a3094923622d6eaef (cherry picked from commit 5d66eafec581fb3209af74595784be8854ca40a4)
* doc: Added release 5.9 notesv5.9hari gowtham2019-08-061-0/+23
| | | | | | | Fixes: bz#1737313 Change-Id: Icc44f0d1b33de628ba5c1b0672ed56791d281dcf Signed-off-by: hari gowtham <hgowtham@redhat.com>
* geo-rep: Fix mount broker setup issueKotresh HR2019-08-061-3/+7
| | | | | | | | | | | | | | | | | | | | Even the use builtin 'type' command as in patch [1] causes issues if argument in question is not part of PATH environment variable for that user. This patch fixes the same by doing source /etc/profile. This was already being used in another part of script. [1] https://review.gluster.org/23089 Backport of: > Patch: https://review.gluster.org/23136/ > Change-Id: Iceb78835967ec6a4350983eec9af28398410c002 > BUG: 1734738 > Signed-off-by: Kotresh HR <khiremat@redhat.com> Change-Id: Iceb78835967ec6a4350983eec9af28398410c002 fixes: bz#1737716 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* gfapi: Fix deadlock while processing upcallSoumya Koduri2019-08-021-33/+131
| | | | | | | | | | | | | | | | | | | As mentioned in bug1733166, there could be potential deadlock while processing upcalls depending on how each xlator choose to act on it. The right way of fixing such issues is to change rpc callback communication process. - https://github.com/gluster/glusterfs/issues/697 Till then, making changes in gfapi layer to avoid any I/O processing. This is backport of below mainline patch > https://review.gluster.org/#/c/glusterfs/+/23108/ > bz#1733166 Change-Id: I2079e95339e5d761d5060707f4555cfacab95c83 fixes: bz#1736342 Signed-off-by: Soumya Koduri <skoduri@redhat.com>
* geo-rep: Fix mount broker setup issueKotresh HR2019-07-291-3/+3
| | | | | | | | | | | | | | | | | | | | | The patch [1] added validation in gverify.sh to check if the gluster binary exists on slave by executing gluster directly on slave. But for non-root users, even though gluster binary is present, path is not found when executed via ssh. Hence validate the gluster binary using bash builtin 'type' command. [1] https://review.gluster.org/19224 Backport of: > Patch: https://review.gluster.org/23089/ > Change-Id: I93ca62c0c5b1e16263e586ddbbca8407d60ca126 > BUG: 1731920 > Signed-off-by: Kotresh HR <khiremat@redhat.com> (cherry picked from commit 2aa731a259ea457c07494e3c3edf6d5f7c02fe77) Change-Id: I93ca62c0c5b1e16263e586ddbbca8407d60ca126 fixes: bz#1733881 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* doc: Added release 5.8 notesv5.8hari gowtham2019-07-181-4/+10
| | | | | | | Fixes: bz#1731014 Change-Id: I4b8c079865a0193f0a3c4665230ca6c50a223dea Signed-off-by: hari gowtham <hgowtham@redhat.com>
* glusterd: add GF_TRANSPORT_BOTH_TCP_RDMA in glusterd_get_gfproxy_client_volfileAtin Mukherjee2019-07-182-1/+4
| | | | | | | | | | | | | ... with out which volume creation fails with "volume create: <xyz>: failed: Failed to create volume files" >Fixes: bz#1716812 >Change-Id: I2f4c2c6d5290f066b54e1c1db19e25db9937bedb >Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Fixes: bz#1721106 Change-Id: I2f4c2c6d5290f066b54e1c1db19e25db9937bedb Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* Configure changes to build with python2hari gowtham2019-07-181-6/+6
| | | | | | | | | | | | | | | | | | Issue: On a centos7 machine with python3, glupy doesnt get built. Throws the following: configure: WARNING: --------------------------------------------------------------------------------- cannot build glupy. python 3.6 and python-devel/python-dev package are required. --------------------------------------------------------------------------------- Cause: glupy needs python3 with python3-devel or python2 should be used. Fix: We are making the configure prioritize python2 over python3. Change-Id: I6d4ebc6d181c48cb1ccfd19a9a1793fe2cae3f27 fixes: bz#1728988 Signed-off-by: hari gowtham <hgowtham@redhat.com>
* gfapi: fix incorrect initialization of upcall syncop argumentsSoumya Koduri2019-07-051-37/+72
| | | | | | | | | | | | | | | | While sending upcall notifications via synctasks, the argument used to carry relevant data for these tasks is not initialized properly. This patch is to fix the same. This is backport of below mainline fix - > fixes: bz#1718316 > patch url: https://review.gluster.org/#/c/glusterfs/+/22839/ > Signed-off-by: Soumya Koduri <skoduri@redhat.com> Change-Id: I9fa8f841e71d3c37d3819fbd430382928c07176c fixes: bz#1720636 Signed-off-by: Soumya Koduri <skoduri@redhat.com> (cherry picked from commit bc6fd4cfa6ed34de3ffc02e2279fcc713f80f530)
* upcall: Avoid sending notifications for invalid inodesSoumya Koduri2019-07-041-1/+18
| | | | | | | | | | | | | | | | | | | | | | | | For nameless LOOKUPs, server creates a new inode which shall remain invalid until the fop is successfully processed post which it is linked to the inode table. But incase if there is an already linked inode for that entry, it discards that newly created inode which results in upcall notification. This may result in client being bombarded with unnecessary upcalls affecting performance if the data set is huge. This issue can be avoided by looking up and storing the upcall context in the original linked inode (if exists), thus saving up on those extra callbacks. This is backport of below mainline fix - > fixes: bz#1718338 > patch url: https://review.gluster.org/#/c/glusterfs/+/22840/ > Signed-off-by: Soumya Koduri <skoduri@redhat.com> Change-Id: I044a1737819bb40d1a049d2f53c0566e746d2a17 fixes: bz#1720634 Signed-off-by: Soumya Koduri <skoduri@redhat.com>
* doc: Added release notes for 5.7v5.7hari gowtham2019-07-021-0/+26
| | | | | | | Fixes: bz#1697986 Change-Id: I6dc17424665431957152761eaec7b6b2226ae1f0 Signed-off-by: hari gowtham <hgowtham@redhat.com>
* cluster/ec: honor contention notifications for partially acquired locksXavi Hernandez2019-06-282-1/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | EC was ignoring lock contention notifications received while a lock was being acquired. When a lock is partially acquired (some bricks have granted the lock but some others not yet) we can receive notifications from acquired bricks, which should be honored, since we may not receive more notifications after that. Since EC was ignoring them, once the lock was acquired, it was not released until the eager-lock timeout, causing unnecessary delays on other clients. This fix takes into consideration the notifications received before having completed the full lock acquisition. After that, the lock will be releaed as soon as possible. Backport of: > BUG: bz#1708156 > Change-Id: I2a306dbdb29fb557dcab7788a258bd75d826cc12 > Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> Fixes: bz#1717282 Change-Id: I2a306dbdb29fb557dcab7788a258bd75d826cc12 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> Signed-off-by: Hari Gowtham <hgowtham@redhat.com>
* tests/utils: Fix py2/py3 util python scriptsKotresh HR2019-06-288-30/+261
| | | | | | | | | | | | | | | | | | | | | Following files are fixed. tests/bugs/distribute/overlap.py tests/utils/changelogparser.py tests/utils/create-files.py tests/utils/gfid-access.py tests/utils/libcxattr.py Have marked glupy as bad test. Backport of: > Change-Id: I3db857cc19e19163d368d913eaec1269fbc37140 > BUG: 1193929 > Signed-off-by: Kotresh HR <khiremat@redhat.com> Change-Id: I3db857cc19e19163d368d913eaec1269fbc37140 Updates: bz#1629877 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* ec: fix truncate lock to cover the write in tuncate cleanKinglong Mee2019-05-081-2/+6
| | | | | | | | | | | ec_truncate_clean does writing under the lock granted for truncate, but the lock is calculated by ec_adjust_offset_up, so that, the write in ec_truncate_clean is out of lock. Updates: bz#1699500 Change-Id: I15ed1b0807d75c5eb817323f1c227e97d03e0e7c Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> (cherry picked from commit 0e1223491e964096384edfae5032ed0d50d028ad)
* performance/write-behind: remove request from wip list in wb_writev_cbkRaghavendra G2019-05-081-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a race in the way O_DIRECT writes are handled. Assume two overlapping write requests w1 and w2. * w1 is issued and is in wb_inode->wip queue as the response is still pending from bricks. Also wb_request_unref in wb_do_winds is not yet invoked. list_for_each_entry_safe (req, tmp, tasks, winds) { list_del_init (&req->winds); if (req->op_ret == -1) { call_unwind_error_keep_stub (req->stub, req->op_ret, req->op_errno); } else { call_resume_keep_stub (req->stub); } wb_request_unref (req); } * w2 is issued and wb_process_queue is invoked. w2 is not picked up for winding as w1 is still in wb_inode->wip. w1 is added to todo list and wb_writev for w2 returns. * response to w1 is received and invokes wb_request_unref. Assume wb_request_unref in wb_do_winds (see point 1) is not invoked yet. Since there is one more refcount, wb_request_unref in wb_writev_cbk of w1 doesn't remove w1 from wip. * wb_process_queue is invoked as part of wb_writev_cbk of w1. But, it fails to wind w2 as w1 is still in wip. * wb_requet_unref is invoked on w1 as part of wb_do_winds. w1 is removed from all queues including w1. * After this point there is no invocation of wb_process_queue unless new request is issued from application causing w2 to be hung till the next request. This bug is similar to bz 1626780 and bz 1379655. Change-Id: Iaa47437613591699d4c8ad18bc0b32de6affcc31 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Fixes: bz#1707198 (cherry picked from commit 6454132342c0b549365d92bcf3572ecd914f7fa8)
* cluster/afr: Remove local from owners_list on failure of lock-acquisitionPranith Kumar K2019-05-085-18/+61
| | | | | | | | | | | | | When eager-lock lock acquisition fails because of say network failures, the local is not being removed from owners_list, this leads to accumulation of waiting frames and the application will hang because the waiting frames are under the assumption that another transaction is in the process of acquiring lock because owner-list is not empty. Handled this case as well in this patch. Added asserts to make it easier to find these problems in future. fixes bz#1699736 Change-Id: I3101393265e9827755725b1f2d94a93d8709e923 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* cluster/dht: Request linkto xattrs in dht_rmdir opendirN Balachandran2019-04-101-1/+26
| | | | | | | | | | | | | | | | | | | If parallel-readdir is enabled, the rda xlator is loaded below dht in the graph and proactively lists and caches entries when an opendir is performed. dht_rmdir checks if the directory being deleted contains stale linkto files by performing a readdirp on its child subvols. However, as the entries are actually read in during the opendir operation which does not request the linkto xattr,no linkto xattrs are present for the entries causing dht to incorrectly identify them as data files and fail the rmdir operation with ENOTEMPTY. DHT now always adds the linkto xattr in the list of xattrs requested in the opendir. Change-Id: I0711198e66c59146282eb8b88084170bedfb4018 fixes: bz#1695399 Signed-off-by: N Balachandran <nbalacha@redhat.com> (cherry picked from commit 110006bbcd5bb3e814b4cfe7d74cb41891ac3b0c)