summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* core: fix memory allocation issuesv7.0rc1Xavi Hernandez2019-09-162-27/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | Two problems have been identified that caused that gluster's memory usage were twice higher than required. 1. An off by 1 error caused that all objects allocated from the memory pools were taken from a pool bigger than required. Since each pool corresponds to a size equal to a power of two, this was wasting half of the available memory. 2. The header information used for accounting on each memory object was not taken into consideration when searching for a suitable memory pool. It was added later when each individual block was allocated. This made this space "invisible" to memory accounting. Credits: Thanks to Nithya Balachandran for identifying this problem and testing this patch. >Fixes: bz#1722802 Change-Id: I90e27ad795fe51ca11c13080f62207451f6c138c >Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> >(cherry picked from commit 1716a907da1a835b658740f1325033d7ddd44952) Fixes: bz#1748774 Change-Id: I90e27ad795fe51ca11c13080f62207451f6c138c Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* core/syncop: Bail out if frame creation failsSoumya Koduri2019-09-151-0/+6
| | | | | | | | | | | | | | There could be cases (either due to insufficient memory or corrupted mem-pool) due to which frame creation fails. Bail out with error in such cases. This is the backport of below mainline fix - > Fixes: bz#1748448 > review url: https://review.gluster.org/#/c/glusterfs/+/23350/ Change-Id: I8cc0a5852f6f04d2bac991e4eb79ecb42577da11 Fixes: bz#1751556 Signed-off-by: Soumya Koduri <skoduri@redhat.com>
* glusterd: fix use-after-free of a dict_tXavi Hernandez2019-09-151-1/+1
| | | | | | | | | | | | | | | | | | A dict was passed to a function that calls dict_unref() without taking any additional reference. Given that the same dict is also used after the function returns, this was causing a use-after-free situation. To fix the issue, we simply take an additional reference before calling the function. > Fixes: bz#1723890 > Change-Id: I98c6b76b08fe3fa6224edf281a26e9ba1ffe3017 > Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> > (cherry picked from commit f36086db87aae24c10abde434f081d78b942735e) Fixes: bz#1752245 Change-Id: I98c6b76b08fe3fa6224edf281a26e9ba1ffe3017 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* afr/lookup: Pass xattr_req in while doing a selfheal in lookupMohammed Rafi KC2019-09-115-5/+69
| | | | | | | | | | | | | | | | | | We were not passing xattr_req when doing a name self heal as well as a meta data heal. Because of this, some xdata was missing which causes i/o errors Backport of > https://review.gluster.org/#/c/glusterfs/+/23024/ >Change-Id: Ibfb1205a7eb0195632dc3820116ffbbb8043545f >Fixes: bz#1728770 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Fixes: bz#1749305 Change-Id: Ibfb1205a7eb0195632dc3820116ffbbb8043545f Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> (cherry picked from commit d026f0bcfd301712e4f0671ccf238f43f2e6dd30)
* rpc: Update address family if it is not provide in cmd-line argumentsMohit Agrawal2019-09-091-1/+12
| | | | | | | | | | | | | | | | | | | | Problem: After enabling transport-type to inet6 and passed ipv6 transport.socket.bind-address in glusterd.vol clients are not started. Solution: Need to update address-family based on remote-address for all gluster client process > Change-Id: Iaa3588cd87cebc45231bfd675745c1a457dc9b31 > Fixes: bz#1747746 > Credits: Amgad Saleh <amgad.saleh@nokia.com> > Signed-off-by: Mohit Agrawal <moagrawal@redhat.com> > (cherry picked from commit 80b8cfe3f1386606bada97a76a0cad7acdf6b877) Change-Id: Iaa3588cd87cebc45231bfd675745c1a457dc9b31 Fixes: bz#1749664 Credits: Amgad Saleh <amgad.saleh@nokia.com> Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* glusterd: IPV6 hostname address is not parsed correctlyMohit Agrawal2019-09-061-5/+11
| | | | | | | | | | | | | | | | | | Problem: IPV6 hostname address is not parsed correctly in function glusterd_check_brick_order Solution: Update the code to parse hostname address > Change-Id: Ifb2f83f9c6e987b2292070e048e97eeb51b728ab > Fixes: bz#1747746 > Credits: Amgad Saleh <amgad.saleh@nokia.com> > Signed-off-by: Mohit Agrawal <moagrawal@redhat.com> > (cherry picked from commit 6563ffb04d7ba51a89726e7c5bbb85c7dbc685b5) Change-Id: Ifb2f83f9c6e987b2292070e048e97eeb51b728ab Fixes: bz#1749664 Credits: Amgad Saleh <amgad.saleh@nokia.com> Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* tests: fix spurious failure of bug-1402841.t-mt-dir-scan-race.tRavishankar N2019-09-051-4/+5
| | | | | | | | | | | | | | | | | | | | | | | Problem: Since commit 600ba94183333c4af9b4a09616690994fd528478, shd starts healing as soon as it is toggled from disabled to enabled. This was causing the following line in the .t to fail on a 'fast' machine (always on my laptop and sometimes on the jenkins slaves). EXPECT_NOT "^0$" get_pending_heal_count $V0 because by the time shd was disabled, the heal was already completed. Fix: Increase the no. of files to be healed and make it a variable called FILE_COUNT, should we need to bump it up further because the machines become even faster. Also created pending metadata heals to increase the time taken to heal a file. fixes: bz#1749155 Change-Id: I5a26b08e45b8c19bce3c01ce67bdcc28ed48198d Signed-off-by: Ravishankar N <ravishankar@redhat.com> (cherry picked from commit 724c657995a2e148243eeb78c68b620c6d7714a5)
* doc: documented about fips-mode-rchecksumRinku Kothiya2019-09-041-1/+7
| | | | | | | | Updated release notes to document about fips-mode-rchecksum. fixes: bz#1703322 Change-Id: Id6707fca6fc2dbc251f6e00e635a63d9e31f88f7 Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
* [RFC] change get_real_filename implementation to use ENOATTR instead of ENOENTMichael Adam2019-09-032-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | get_real_filename is implemented as a virtual extended attribute to help Samba implement the case-insensitive but case preserving SMB protocol more efficiently. It is implemented as a getxattr call on the parent directory with the virtual key of "get_real_filename:<entryname>" by looking for a spelling with different case for the provided file/dir name (<entryname>) and returning this correct spelling as a result if the entry is found. Originally (05aaec645a6262d431486eb5ac7cd702646cfcfb), the implementation used the ENOENT errno to return the authoritative answer that <entryname> does not exist in any case folding. Now this implementation is actually a violation or misuse of the defined API for the getxattr call which returns ENOENT for the case that the dir that the call is made against does not exist and ENOATTR (or the synonym ENODATA) for the case that the xattr does not exist. This was not a problem until the gluster fuse-bridge was changed to do map ENOENT to ESTALE in 59629f1da9dca670d5dcc6425f7f89b3e96b46bf, after which we the getxattr call for get_real_filename returned an ESTALE instead of ENOENT breaking the expectation in Samba. It is an independent problem that ESTALE should not leak out to user space but is intended to trigger retries between fuse and gluster. But nevertheless, the semantics seem to be incorrect here and should be changed. This patch changes the implementation of the get_real_filename virtual xattr to correctly return ENOATTR instead of ENOENT if the file/directory being looked up is not found. The Samba glusterfs_fuse vfs module which takes advantage of the get_real_filename over a fuse mount will receive a corresponding change to map ENOATTR to ENOENT. Without this change, it will still work correctly, but the performance optimization for nonexisting files is lost. On the other hand side, this change removes the distinction between the old not-implemented case and the implemented case. So Samba changed to treat ENOATTR like ENOENT will not work correctly any more against old servers that don't implement get_real_filename. I.e. existing files will be reported as non-existing Change-Id: I971b427ab8410636d5d201157d9af70e0d075b67 fixes: bz#1745914 Signed-off-by: Michael Adam <obnox@samba.org> (cherry picked from commit dc1b87fcfef08c9497b0c02b2410c9d18bbc2dba)
* afr: wake up index healer threadsRavishankar N2019-08-306-11/+67
| | | | | | | | | | | | | | ...whenever shd is re-enabled after disabling or there is a change in `cluster.heal-timeout`, without needing to restart shd or waiting for the current `cluster.heal-timeout` seconds to expire. See BZ 1743988 for more details. Change-Id: Ia5ebd7c8e9f5b54cba3199c141fdd1af2f9b9bfe fixes: bz#1747301 Reported-by: Glen Kiessling <glenk1973@hotmail.com> Signed-off-by: Ravishankar N <ravishankar@redhat.com> (cherry picked from commit 600ba94183333c4af9b4a09616690994fd528478)
* ctime: Fix incorrect realtime passed to frame->root->ctimeKotresh HR2019-08-284-1/+26
| | | | | | | | | | | | | | | | | | | On systems that don't support "timespec_get"(e.g., centos6), it was using "clock_gettime" with "CLOCK_MONOTONIC" to get unix epoch time which is incorrect. This patch introduces "timespec_now_realtime" which uses "clock_gettime" with "CLOCK_REALTIME" which fixes the issue. Backport of: > Patch: https://review.gluster.org/23274/ > Change-Id: I57be35ce442d7e05319e82112b687eb4f28d7612 > Signed-off-by: Kotresh HR <khiremat@redhat.com> > BUG: 1743652 (cherry picked from commit d14d0749340d9cb1ef6fc4b35f2fb3015ed0339d) Change-Id: I57be35ce442d7e05319e82112b687eb4f28d7612 Signed-off-by: Kotresh HR <khiremat@redhat.com> fixes: bz#1746145
* ctime: Fix ctime issue with utime family of syscallsKotresh HR2019-08-274-52/+68
| | | | | | | | | | | | | | | | When atime|mtime is updated via utime family of syscalls, ctime is not updated. This patch fixes the same. Backport of: > Patch: https://review.gluster.org/23177/ > Change-Id: I7f86d8f8a1e06a332c3449b5bbdbf128c9690f25 > BUG: 1738786 > Signed-off-by: Kotresh HR <khiremat@redhat.com> (cherry picked from commit 95f71df31dc73d85df722b0e7d3a7eb1e0237e7f) Change-Id: I7f86d8f8a1e06a332c3449b5bbdbf128c9690f25 fixes: bz#1746142 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* fuse: add missing GF_FREE to fuse_interruptCsaba Henk2019-08-271-1/+4
| | | | | | Change-Id: Id7e003e4a53d0a0057c1c84e1cd704c80a6cb015 Fixes: bz#1744874 Signed-off-by: Csaba Henk <csaba@redhat.com>
* glusterd: ./tests/bugs/glusterd/bug-1595320.t is failingMohit Agrawal2019-08-261-1/+1
| | | | | | | | | | | | | | | | | | Problem: sometime ./tests/bugs/glusterd/bug-1595320.t is failing is failing at the time of checking brick_process after sending a kill signal to brick process Solution: Wait sometime after just sending a kill signal to brick process to make sure brick process is stopped > Change-Id: Iee9e91284618abfc62a550d47e4f9117785def58 > Fixes: bz#1743200 > Signed-off-by: Mohit Agrawal <moagrawal@redhat.com> > (cherry picked from commit 8f1620ad7f5d3d040fee55c5f873349800e2268d) Change-Id: Iee9e91284618abfc62a550d47e4f9117785def58 Fixes: bz#1745422 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* storage/posix: set the op_errno to proper errno during gfid setv7.0rc0Raghavendra Bhat2019-08-221-0/+1
| | | | | | | | | | In posix_gfid_set, the proper error is not captured in one of the failure cases. Change-Id: I1c13f0691a15d6893f1037b3a5fe385a99657e00 Fixes: bz#1736481 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> (cherry picked from commit ed7a3793073670e787063c47e55010fc7c963064)
* cluster/ec: Update lock->good_mask on parent fop failurePranith Kumar K2019-08-222-0/+4
| | | | | | | | | | When discard/truncate performs write fop, it should do so after updating lock->good_mask to make sure readv happens on the correct mask fixes: bz#1739424 Change-Id: Idfef0bbcca8860d53707094722e6ba3f81c583b7 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* cluster/ec: Fix reopen flags to avoid misbehaviorPranith Kumar K2019-08-222-3/+8
| | | | | | | | | | | | | | | | | | | | | | | Problem: when a file needs to be re-opened O_APPEND and O_EXCL flags are not filtered in EC. - O_APPEND should be filtered because EC doesn't send O_APPEND below EC for open to make sure writes happen on the individual fragments instead of at the end of the file. - O_EXCL should be filtered because shd could have created the file so even when file exists open should succeed - O_CREAT should be filtered because open happens with gfid as parameter. So open fop will create just the gfid which will lead to problems. Fix: Filter out these two flags in reopen. Change-Id: Ia280470fcb5188a09caa07bf665a2a94bce23bc4 Fixes: bz#1739426 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* cluster/ec: Always read from good-maskPranith Kumar K2019-08-222-5/+25
| | | | | | | | | | There are cases where fop->mask may have fop->healing added and readv shouldn't be wound on fop->healing. To avoid this always wind readv to lock->good_mask updates: bz#1739424 Change-Id: I2226ef0229daf5ff315d51e868b980ee48060b87 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* cluster/ec: fix EIO error for concurrent writes on sparse filesXavi Hernandez2019-08-222-10/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | EC doesn't allow concurrent writes on overlapping areas, they are serialized. However non-overlapping writes are serviced in parallel. When a write is not aligned, EC first needs to read the entire chunk from disk, apply the modified fragment and write it again. The problem appears on sparse files because a write to an offset implicitly creates data on offsets below it (so, in some way, they are overlapping). For example, if a file is empty and we read 10 bytes from offset 10, read() will return 0 bytes. Now, if we write one byte at offset 1M and retry the same read, the system call will return 10 bytes (all containing 0's). So if we have two writes, the first one at offset 10 and the second one at offset 1M, EC will send both in parallel because they do not overlap. However, the first one will try to read missing data from the first chunk (i.e. offsets 0 to 9) to recombine the entire chunk and do the final write. This read will happen in parallel with the write to 1M. What could happen is that half of the bricks process the write before the read, and the half do the read before the write. Some bricks will return 10 bytes of data while the otherw will return 0 bytes (because the file on the brick has not been expanded yet). When EC tries to recombine the answers from the bricks, it can't, because it needs more than half consistent answers to recover the data. So this read fails with EIO error. This error is propagated to the parent write, which is aborted and EIO is returned to the application. The issue happened because EC assumed that a write to a given offset implies that offsets below it exist. This fix prevents the read of the chunk from bricks if the current size of the file is smaller than the read chunk offset. This size is correctly tracked, so this fixes the issue. Also modifying ec-stripe.t file for Test #13 within it. In this patch, if a file size is less than the offset we are writing, we fill zeros in head and tail and do not consider it strip cache miss. That actually make sense as we know what data that part holds and there is no need of reading it from bricks. Change-Id: Ic342e8c35c555b8534109e9314c9a0710b6225d6 Fixes: bz#1739427 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* cluster/ec: inherit healing from lock when it has infoKinglong Mee2019-08-221-2/+3
| | | | | | | | | If lock has info, fop should inherit healing mask from it. Otherwise, fop cannot inherit right healing when changed_flags is zero. Change-Id: Ife80c9169d2c555024347a20300b0583f7e8a87f updates: bz#1739424 Signed-off-by: Kinglong Mee <mijinlong@horiscale.com>
* afr: restore timestamp of parent dir during entry-healRavishankar N2019-08-212-0/+80
| | | | | | | Fixes: bz#1741041 Change-Id: I29e338bac62104233a6f80212df8d0fb016affda Signed-off-by: Ravishankar N <ravishankar@redhat.com> (cherry picked from commit 8e9c53ebf16705b9a1db2fc486dc24a5cb244ddd)
* features/shard: Send correct size when reads are sent beyond file sizeKrutika Dhananjay2019-08-212-0/+31
| | | | | | | Change-Id: I0cebaaf55c09eb1fb77a274268ff564e871b743b fixes bz#1740316 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> (cherry picked from commit 51237eda7c4b3846d08c5d24d1e3fe9b7ffba1d4)
* doc: Added initial release notes for release-7Rinku Kothiya2019-08-211-0/+228
| | | | | | | updates: bz#1732875 Change-Id: I12f4c1e74e238c9906f5068ccbde38343ca14ff5 Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
* event: rename event_XXX with gf_ prefixedXiubo Li2019-08-2115-90/+92
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I hit one crash issue when using the libgfapi. In the libgfapi it will call glfs_poller() --> event_dispatch() in file api/src/glfs.c:721, and the event_dispatch() is defined by libgluster locally, the problem is the name of event_dispatch() is the extremly the same with the one from libevent package form the OS. For example, if a executable program Foo, which will also use and link the libevent and the libgfapi at the same time, I can hit the crash, like: kernel: glfs_glfspoll[68486]: segfault at 1c0 ip 00007fef006fd2b8 sp 00007feeeaffce30 error 4 in libevent-2.0.so.5.1.9[7fef006ed000+46000] The link for Foo is: lib_foo_LADD = -levent $(GFAPI_LIBS) It will crash. This is because the glfs_poller() is calling the event_dispatch() from the libevent, not the libglsuter. The gfapi link info : GFAPI_LIBS = -lacl -lgfapi -lglusterfs -lgfrpc -lgfxdr -luuid If I link Foo like: lib_foo_LADD = $(GFAPI_LIBS) -levent It will works well without any problem. And if Foo call one private lib, such as handler_glfs.so, and the handler_glfs.so will link the GFAPI_LIBS directly, while the Foo won't and it will dlopen(handler_glfs.so), then the crash will be hit everytime. The link info will be: foo_LADD = -levent libhandler_glfs_LIBADD = $(GFAPI_LIBS) I can avoid the crash temporarily by linking the GFAPI_LIBS in Foo too like: foo_LADD = $(GFAPI_LIBS) -levent libhandler_glfs_LIBADD = $(GFAPI_LIBS) But this is ugly since the Foo won't use any APIs from the GFAPI_LIBS. And in some cases when the --as-needed link option is added(on many dists it is added as default), then the crash is back again, the above workaround won't work. Backport of: > https://review.gluster.org/#/c/glusterfs/+/23110/ > Change-Id: I38f0200b941bd1cff4bf3066fca2fc1f9a5263aa > Fixes: #699 > Signed-off-by: Xiubo Li <xiubli@redhat.com> Change-Id: I38f0200b941bd1cff4bf3066fca2fc1f9a5263aa updates: bz#1740519 Signed-off-by: Xiubo Li <xiubli@redhat.com> (cherry picked from commit 799edc73c3d4f694c365c6a7c27c9ab8eed5f260)
* locks/fencing: Address hang while lock preemptionSusant Palai2019-08-213-20/+29
| | | | | | | | | | | | | | | | | The fop_wind_count can go negative when fencing is enabled on unwind path of the IO leading to hang. Also changed code so that fop_wind_count needs to be maintained only till fencing is enabled on the file. > updates: bz#1717824 > Change-Id: Icd04b42bc16cd3d50eaa581ee57233910194f480 > signed-off-by: Susant Palai <spalai@redhat.com> (backport of https://review.gluster.org/#/c/glusterfs/+/23088/) fixes: bz#1740077 Change-Id: Icd04b42bc16cd3d50eaa581ee57233910194f480 Signed-off-by: Susant Palai <spalai@redhat.com>
* rpc: glusterd start is failed and throwing an error Address already in useMohit Agrawal2019-08-191-7/+37
| | | | | | | | | | | | | | | | | | | | Problem: Some of the .t are failed due to bind is throwing an error EADDRINUSE Solution: After killing all gluster processes .t is trying to start glusterd but somehow if kernel has not cleaned up resources(socket) then glusterd startup is failed due to bind system call failure.To avoid the issue retries to call bind 10 times to execute system call succesfully > Change-Id: Ia5fd6b788f7b211c1508c1b7304fc08a32266629 > Fixes: bz#1743020 > Signed-off-by: Mohit Agrawal <moagrawal@redhat.com> > (cherry picked from commit c370c70f77079339e2cfb7f284f3a2fb13fd2f97) Change-Id: Ia5fd6b788f7b211c1508c1b7304fc08a32266629 Fixes: bz#1743218 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* features/utime: always update ctime at setattrKinglong Mee2019-08-192-13/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | For the nfs EXCLUSIVE mode create may sets a later time to mtime (at verifier), it should not set to ctime for storage.ctime does not allowed set ctime to a earlier time. /* Earlier, mdata was updated only if the existing time is less * than the time to be updated. This would fail the scenarios * where mtime can be set to any time using the syscall. Hence * just updating without comparison. But the ctime is not * allowed to changed to older date. */ According to kernel's setattr, always set ctime at setattr, and doesnot set ctime from mtime at storage.ctime. Backport of: > Patch: https://review.gluster.org/23154 > Change-Id: I5cfde6cb7f8939da9617506e3dc80bd840e0d749 > BUG: 1737288 > Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Change-Id: I5cfde6cb7f8939da9617506e3dc80bd840e0d749 fixes: bz#1739437 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* posix/ctime: Fix race during lookup ctime xattr healKotresh HR2019-08-191-18/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Ctime heals the ctime xattr ("trusted.glusterfs.mdata") in lookup if it's not present. In a multi client scenario, there is a race which results in updating the ctime xattr to older value. e.g. Let c1 and c2 be two clients and file1 be the file which doesn't have the ctime xattr. Let the ctime of file1 be t1. (from backend, ctime heals time attributes from backend when not present). Now following operations are done on mount c1 -> ls -l /mnt/file1 | c2 -> ls -l /mnt/file1;echo "append" >> /mnt/file1; The race is that the both c1 and c2 didn't fetch the ctime xattr in lookup, so both of them tries to heal ctime to time 't1'. If c2 wins the race and appends the file before c1 heals it, it sets the time to 't1' and updates it to 't2' (because of append). Now c1 proceeds to heal and sets it to 't1' which is incorrect. Solution: Compare the times during heal and only update the larger time. This is the general approach used in ctime feature but got missed with healing legacy files. Backport of: > Patch: https://review.gluster.org/23131 > BUG: 1734299 > Change-Id: I930bda192c64c3d49d0aed431ce23d3bc57e51b7 > Signed-off-by: Kotresh HR <khiremat@redhat.com> fixes: bz#1739436 Change-Id: I930bda192c64c3d49d0aed431ce23d3bc57e51b7 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* features/utime: Fix mem_put crashPranith Kumar K2019-08-191-1/+3
| | | | | | | | | | | | | | | | | | | | Problem: When frame->local is not null FRAME_DESTROY calls mem_put on it. Since the stub is already destroyed in call_resume(), it leads to crash Fix: Set frame->local to NULL before calling call_resume() Backport of: > Patch: https://review.gluster.org/23091 > BUG: 1593542 > Change-Id: I0f8adf406f4cefdb89d7624ba7a9d9c2eedfb1de > Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> fixes: bz#1739430 Change-Id: I0f8adf406f4cefdb89d7624ba7a9d9c2eedfb1de Signed-off-by: Kotresh HR <khiremat@redhat.com>
* ctime: Set mdata xattr on legacy filesKotresh HR2019-08-1916-57/+475
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: The files which were created before ctime enabled would not have "trusted.glusterfs.mdata"(stores time attributes) xattr. Upon fops which modifies either ctime or mtime, the xattr gets created with latest ctime, mtime and atime, which is incorrect. It should update only the corresponding time attribute and rest from backend Solution: Creating xattr with values from brick is not possible as each brick of replica set would have different times. So create the xattr upon successful lookup if the xattr is not created Note To Reviewers: The time attributes used to set xattr is got from successful lookup. Instead of sending the whole iatt over the wire via setxattr, a structure called mdata_iatt is sent. The mdata_iatt contains only time attributes. Backport of: > Patch: https://review.gluster.org/22936 > Change-Id: I5e535631ddef04195361ae0364336410a2895dd4 > BUG: 1593542 > Signed-off-by: Kotresh HR <khiremat@redhat.com> Change-Id: I5e535631ddef04195361ae0364336410a2895dd4 updates: bz#1739430 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* geo-rep: Fix mount broker setup issueKotresh HR2019-08-191-4/+7
| | | | | | | | | | | | | | | | | | | | | Even the use builtin 'type' command as in patch [1] causes issues if argument in question is not part of PATH environment variable for that user. This patch fixes the same by doing source /etc/profile. This was already being used in another part of script. [1] https://review.gluster.org/23089 Backport of: > Patch: https://review.gluster.org/23136/ > Change-Id: Iceb78835967ec6a4350983eec9af28398410c002 > BUG: 1734738 > Signed-off-by: Kotresh HR <khiremat@redhat.com> (cherry picked from commit 84f7794547522463841068063b22fd3a8d8fca2b) Change-Id: Iceb78835967ec6a4350983eec9af28398410c002 fixes: bz#1739442 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* posix : add posix_set_ctime() in posix_ftruncate()Jiffin Tony Thottan2019-08-091-0/+2
| | | | | | | | | | | >Backport of https://review.gluster.org/#/c/glusterfs/+/22948/ >Change-Id: I0cb5320fea71306e0283509ae47024f23874b53b >fixes: bz#1723761 Change-Id: I0cb5320fea71306e0283509ae47024f23874b53b fixes: bz#1739399 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> (cherry picked from commit 7d8be567f2f904fc74d0990ebce2e8afbedab918)
* cluster/dht: Fixed a memleak in dht_rename_cbkN Balachandran2019-08-091-11/+33
| | | | | | | | | Fixed a memleak in dht_rename_cbk when creating a linkto file. Change-Id: I705adef3cb79e33806520fc2b15558e90e2c211c fixes: bz#1739337 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* protocol/client: propagte GF_EVENT_CHILD_PING only for connections to brickRaghavendra G2019-08-091-4/+12
| | | | | | | | | | | | | | | | | | | | | | | | Two reasons: * ping responses from glusterd may not be relevant for Halo replication. Instead, it might be interested in only knowing whether the brick itself is responsive. * When a brick is killed, propagating GF_EVENT_CHILD_PING of ping response from glusterd results in GF_EVENT_DISCONNECT spuriously propagated to parent xlators. These DISCONNECT events are from the connections client establishes with glusterd as part of its reconnect logic. Without GF_EVENT_CHILD_PING, the last event propagated to parent xlators would be the first DISCONNECT event from brick and hence subsequent DISCONNECTS to glusterd are not propagated as protocol/client prevents same event being propagated to parent xlators consecutively. propagating GF_EVENT_CHILD_PING for ping responses from glusterd would change the last_sent_event to GF_EVENT_CHILD_PING and hence protocol/client cannot prevent subsequent DISCONNECT events Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Fixes: bz#1739334 Change-Id: I50276680c52f05ca9e12149a3094923622d6eaef (cherry picked from commit 5d66eafec581fb3209af74595784be8854ca40a4)
* gfapi: Fix deadlock while processing upcallSoumya Koduri2019-08-021-33/+131
| | | | | | | | | | | | | | | | | | | As mentioned in bug1733166, there could be potential deadlock while processing upcalls depending on how each xlator choose to act on it. The right way of fixing such issues is to change rpc callback communication process. - https://github.com/gluster/glusterfs/issues/697 Till then, making changes in gfapi layer to avoid any I/O processing. This is backport of below mainline patch > https://review.gluster.org/#/c/glusterfs/+/23108/ > bz#1733166 Change-Id: I2079e95339e5d761d5060707f4555cfacab95c83 fixes: bz#1736345 Signed-off-by: Soumya Koduri <skoduri@redhat.com>
* glusterd: don't log a warning message for tier-enabled keySanju Rakonde2019-07-291-2/+5
| | | | | | | | | | | | | | We are logging a warning message saying unknown-key for tier-enabled kay. although the tier xlator is deprecated, this key is left behind for handling the peer rejection issues in a heterogeneous cluster. We need not to log if this key is not found/recognised. updates: bz#1727008 Change-Id: Ia68661898a618f99a240ca8d8a124ff6a65ebe9d Signed-off-by: Sanju Rakonde <srakonde@redhat.com> (cherry picked from commit 7d270d05f835d41e32572501246b50181dc9be56)
* api: Update all future API versions to rel-7Rinku Kothiya2019-07-244-4/+4
| | | | | | | | | As release 7 is branched, all future APIs now become 7.0 This change implements the same. Change-Id: I420e9ea43b07a40a4fb911ee56fb7436c2744edc Updates: bz#1732875 Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
* features/snapview-server: obtain the list of snapshots inside the lockRaghavendra Bhat2019-07-241-1/+1
| | | | | | | | | The current list of snapshots from priv->dirents is obtained outside the lock. Change-Id: I8876ec0a38308da5db058397382fbc82cc7ac177 Fixes: bz#1731512 (cherry picked from commit 8e795617fd6f5193d0d52a336059ce1a28108c0e)
* cluster/ta: Notify the clients only if there are pending healskarthik-us2019-07-244-22/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: In case of thin arbiter, before index healer starts crawling the indices at every heal-timeout interval, even if there is nothing to be healed it will send an upcall notification to all the clients to release any AFR_TA_DOM_NOTIFY locks that they hold. SHD will wait for the upcall to return before proceeding with the heal even though there is nothing to be healed. This will also invalidates the cached information about the bricks states on the clients which leads to extra calls on TA from clients for the next reads & writes if needed. This will impact the IO performance. Fix: - Before sending the upcall to the clients, check for any pending heals on TA without taking any locks. - If there is nothing marked bad on TA, then continue with the index crawl to heal any dirty markings present on the files due to any post-op failure. - If there is a brick marked as bad on TA, then take the AFR_TA_DOM_NOTIFY lock on TA from SHD, get the state on TA and continue with the current healing process. Change-Id: Ieb477bc6cb18bbdfd4e7a0453c5ed79b574ec9d6 fixes: bz#1729483 Signed-off-by: karthik-us <ksubrahm@redhat.com>
* glusterd/svc: update pid of mux volumes from the shd processMohammed Rafi KC2019-07-2415-52/+325
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | For a normal volume, we are updating the pid from a the process while we do a daemonization or at the end of the init if it is no-daemon mode. Along with updating the pid we also lock the file, to make sure that the process is running fine. With brick mux, we were updating the pidfile from gluterd after an attach/detach request. There are two problems with this approach. 1) We are not holding a pidlock for any file other than parent process. 2) There is a chance for possible race conditions with attach/detach. For example, shd start and a volume stop could race. Let's say we are starting an shd and it is attached to a volume. While we trying to link the pid file to the running process, this would have deleted by the thread that doing a volume stop. Backport of : https://review.gluster.org/#/c/glusterfs/+/22935/ >Change-Id: I29a00352102877ce09ea3f376ca52affceb5cf1a >Updates: bz#1722541 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Change-Id: I29a00352102877ce09ea3f376ca52affceb5cf1a Updates: bz#1732668 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* graph/shd: Use top down approach while cleaning xlatorMohammed Rafi KC2019-07-2410-2/+21
| | | | | | | | | | | | | | | | | | | We were cleaning xlator from botton to top, which might lead to problems when upper xlators trying to access the xlator object loaded below. One such scenario is when fd_unref happens as part of the fini call which might lead to calling the releasedir to lower xlator. This will lead to invalid mem access Backport of:https://review.gluster.org/#/c/glusterfs/+/22968/ >Change-Id: I8a6cb619256fab0b0c01a2d564fc88287c4415a0 >Updates: bz#1716695 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Change-Id: I8a6cb619256fab0b0c01a2d564fc88287c4415a0 Updates: bz#1730229 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* graph/shd: Use glusterfs_graph_deactivate to free the xl recMohammed Rafi KC2019-07-243-2/+12
| | | | | | | | | | | | | | | | | We were using glusterfs_graph_fini to free the xl rec from glusterfs_process_volfp as well as glusterfs_graph_cleanup. Instead we can use glusterfs_graph_deactivate, which is does fini as well as other common rec free. Backportof : https://review.gluster.org/22904 >Change-Id: Ie4a5f2771e5254aa5ed9f00c3672a6d2cc8e4bc1 >Updates: bz#1716695 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Change-Id: Ie4a5f2771e5254aa5ed9f00c3672a6d2cc8e4bc1 Updates: bz#1730229 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* core: fix deadlock between statedump and fd_anonymous()Xavi Hernandez2019-07-201-76/+61
| | | | | | | | | | | | | | | | | | There exists a deadlock between statedump generation and fd_anonymous() function because they are acquiring inode table lock and inode lock in reverse order. This patch modifies fd_anonymous() so that it takes inode lock only when it's really necessary, avoiding the deadlock. Backport of: > Change-Id: I24355447f0ea1b39e2546782ad07f0512cc381e7 > BUG: 1727068 > Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> Change-Id: I24355447f0ea1b39e2546782ad07f0512cc381e7 Fixes: bz#1729950 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* glusterd: conditionally clear txn_opinfo in stage opAtin Mukherjee2019-07-201-2/+10
| | | | | | | | | | | | | | ...otherwise this leads to a crash when volume status is run on a heterogeneous mode. > Fixes: bz#1723658 > Change-Id: I0d39f412b2e5e9d3ef0a3462b90b38bb5364b09d > Signed-off-by: Atin Mukherjee <amukherj@redhat.com> > (cherry picked from commit a72452fcf90679b28baec12d2769cbaa982bb4e4) Fixes: bz#1728127 Change-Id: I0d39f412b2e5e9d3ef0a3462b90b38bb5364b09d Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* cluster/afr: Fix incorrect reporting of gfid & type mismatchkarthik-us2019-07-203-2/+139
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problems: 1. When checking for type and gfid mismatch, if the type or gfid is unknown because of missing gfid handle and the gfid xattr it will be reported as type or gfid mismatch and the heal will not complete. 2. If the source selected during entry heal has null gfid the same will be sent to afr_lookup_and_heal_gfid(). In this function when we try to assign the gfid on the bricks where it does not exist, we are considering the same gfid and try to assign that on those bricks. This will fail in posix_gfid_set() since the gfid sent is null. Fix: If the gfid sent to afr_lookup_and_heal_gfid() is null choose a valid gfid before proceeding to assign the gfid on the bricks where it is missing. In afr_selfheal_detect_gfid_and_type_mismatch(), do not report type/gfid mismatch if the type/gfid is unknown or not set. Change-Id: Ia06552e4dc4a9f89cb7f5302833604bd21bbf7da fixes: bz#1729481 Signed-off-by: karthik-us <ksubrahm@redhat.com>
* glusterd: do not mark skip_locking as true for geo-rep operationsSanju Rakonde2019-07-192-2/+10
| | | | | | | | | | | | | | | | | | We need to send the commit req to peers in case of geo-rep operations even though it is a no volname operation. In commit phase peers try to set the txn_opinfo which will fail because it is a no volname operation where we don't require a commit phase. We mark skip_locking as true for no volname operations, but we have to give an exception to geo-rep operations, so that they can set txn_opinfo in commit phase. Please refer to detailed RCA at the bug: 1730543 fixes: bz#1730543 Change-Id: I9f2478b12a281f6e052035c0563c40543493a3fc Signed-off-by: Sanju Rakonde <srakonde@redhat.com> (cherry picked from commit b917974ee922d7a2e079692ad7d6f61f900b37b2)
* features/snapview-server: use the same volfile server for gfapi optionsRaghavendra Bhat2019-07-182-4/+42
| | | | | | | | | | | | snapview server xlator makes use of "localhost" as the volfile server while initing the new glfs instance to talk to a snapshot. While localhost is fine, better use the same volfile server that was used to start the snapshot daemon containing the snapview-server xlator. Change-Id: I4485d39b0e3d066f481adc6958ace53ea33237f7 fixes: bz#1728391 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> (cherry picked from commit 36f6e6df0ff15d0464b869803710adca2b65e8ba)
* glusterd: Show the correct brick status in get-stateMohit Agrawal2019-07-153-2/+37
| | | | | | | | | | | | | | Problem: get-state does not show correct brick status if brick status is not Started, it always shows started if any value is set brickinfo->status Solution: Check the value of brickinfo->status to show correct status in get-state Change-Id: I12a79619024c2cf59f338220d144f2f034059b3b fixes: bz#1726905 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com> (cherry picked from commit af989db23d1db00e087f2b9d3dfc43b13ef17153)
* tests: Fix bug-1717819-metadata-split-brain-detection.t failurekarthik-us2019-07-151-0/+6
| | | | | | | | | | | | | | | | | | Problem: tests/bugs/replicate/bug-1717819-metadata-split-brain-detection.t fails intermittently in test cases #49 & #50, which compare the values of the user set xattr values after enabling the heal. We are not waiting for the heal to complete before comparing those values, which might lead those tests to fail. Fix: Wait till the HEAL-TIMEOUT before comparing the xattr values. Also cheking for the shd to come up and the bricks to connect to the shd process in another case. Change-Id: I0e245b328da9df23ce70c5300278fad1c1d9f7ff fixes: bz#1729895 Signed-off-by: karthik-us <ksubrahm@redhat.com>
* test: Fix spurious failures in bug-1040275-brick-uid-reset-on-volume-restart.tMohit Agrawal2019-07-091-0/+8
| | | | | | | | | | | | | | | | | | Problem: test case is failing after just starting the volume at the time of running stat command on mount point and client is getting error transport endpoint is not conencted Solution: To avoid the error make sure all brick instance should be up and mount point should be active > Change-Id: I49553a04d5b13e155ee02f4a1888a07fe3ee2ff5 > fixes: bz#1721590 > Signed-off-by: Mohit Agrawal <moagrawal@redhat.com> > (cherry picked from commit 283b77805cca3027e333a11c9b00ac611662c9ee) Change-Id: I49553a04d5b13e155ee02f4a1888a07fe3ee2ff5 fixes: bz#1728182 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>