path: root/xlators/performance
Commit message | Author | Age | Files | Lines
* xlators: add classification flag to some | Amar Tumballi | 2018-09-04 | 3 | -0/+3
| | | | | | | | | Add classification to those translators which have `xlator_api_t` already defined and used. Updates: #430 Change-Id: I9d2772cb2c4ed4ab06aaa546500cf3b7d00bddac Signed-off-by: Amar Tumballi <amarts@redhat.com>
* IO cache : fix coverity issue in page.c | Sunny Kumar | 2018-09-04 | 1 | -3/+3
| | | | | | | | This patch fixes CID 1382439 and 1382412. Change-Id: I8696623c168ba76ae2ecac7c9582b4e50437bc53 updates: bz#789278 Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* Multiple files: calloc -> malloc | Yaniv Kaul | 2018-09-04 | 3 | -2/+3
| xlators/storage/posix/src/posix-inode-fd-ops.c xlators/storage/posix/src/posix-helpers.c xlators/storage/bd/src/bd.c xlators/protocol/client/src/client-lk.c xlators/performance/quick-read/src/quick-read.c xlators/performance/io-cache/src/page.c xlators/nfs/server/src/nfs3-helpers.c xlators/nfs/server/src/nfs-fops.c xlators/nfs/server/src/mount3udp_svc.c xlators/nfs/server/src/mount3.c xlators/mount/fuse/src/fuse-helpers.c xlators/mount/fuse/src/fuse-bridge.c xlators/mgmt/glusterd/src/glusterd-utils.c xlators/mgmt/glusterd/src/glusterd-syncop.h xlators/mgmt/glusterd/src/glusterd-snapshot.c xlators/mgmt/glusterd/src/glusterd-rpc-ops.c xlators/mgmt/glusterd/src/glusterd-replace-brick.c xlators/mgmt/glusterd/src/glusterd-op-sm.c xlators/mgmt/glusterd/src/glusterd-mgmt.c xlators/meta/src/subvolumes-dir.c xlators/meta/src/graph-dir.c xlators/features/trash/src/trash.c xlators/features/shard/src/shard.h xlators/features/shard/src/shard.c xlators/features/marker/src/marker-quota.c xlators/features/locks/src/common.c xlators/features/leases/src/leases-internal.c xlators/features/gfid-access/src/gfid-access.c xlators/features/cloudsync/src/cloudsync-plugins/src/cloudsyncs3/src/libcloudsyncs3.c xlators/features/bit-rot/src/bitd/bit-rot.c xlators/features/bit-rot/src/bitd/bit-rot-scrub.c xlators/encryption/crypt/src/metadata.c xlators/encryption/crypt/src/crypt.c xlators/performance/md-cache/src/md-cache.c:
| Move to GF_MALLOC() instead of GF_CALLOC() when possible. It doesn't make sense to calloc (allocate and clear) memory when the code right away fills that memory with data. It may be optimized by the compiler, or have a microscopic performance improvement. In some cases, also changed allocation size to be sizeof some struct or type instead of a pointer - easier to read. In some cases, removed redundant strlen() calls by saving the result into a variable.
| 1. Only done for the straightforward cases. There's room for improvement.
| 2. Please review carefully, especially for string allocation, with the terminating NUL character.
| Only compile-tested!
| ... and allocate only as much memory as needed. xlators/nfs/server/src/mount3.c: Don't blindly allocate PATH_MAX, but strlen() the string and allocate appropriately. Also, align error messages.
| updates: bz#1193929 Original-Author: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: Ibda6f33dd180b7f7694f20a12af1e9576fe197f5
* multiple xlators: move from strlen() to sizeof() | Yaniv Kaul | 2018-08-31 | 2 | -3/+3
| | | | | | | | | | | | | | | xlators/performance/nl-cache/src/nl-cache.c xlators/performance/md-cache/src/md-cache.c xlators/protocol/server/src/authenticate.c xlators/storage/bd/src/bd-helper.c For const strings, just do compile time size calc instead of runtime. Compile-tested only! Change-Id: I9b98940a38d85321a69436a1871930da367b918a updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* IO cache : fix coverity issues in io-cache.c | Sunny Kumar | 2018-08-30 | 1 | -3/+8
| | | | | | | | This patch fixes CID 1382361, 1124714 and 1382432. Change-Id: I0407f35ee44ec6e4522de46092658223d0c8ee6a updates: bz#789278 Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* performance/write-behind: fix fulfill and readdirp race | Raghavendra G | 2018-08-23 | 1 | -33/+136
| Current invalidation of stats in wb_readdirp_cbk is prone to races. As the deleted comment explains,
| <snip>
| We cannot guarantee integrity of entry->d_stat as there are cached writes. The stat is most likely stale as it doesn't account for the cached writes. However, checking for non-empty liability list here is not a fool-proof solution as there can be races like,
| 1. readdirp is successful on posix
| 2. sync of cached write is successful on posix
| 3. write-behind received sync response and removed the request from liability queue
| 4. readdirp response is processed at write-behind.
| In the above scenario, stat for the file is sent back in readdirp response but it is stale.
| </snip>
| The fix is to mark readdirp sessions (tracked in this patch by non-zero value of "readdirps" on parent inode) and if fulfill completes when one or more readdirp sessions are in progress, mark the inode so that wb_readdirp_cbk doesn't send iatts for that inode in readdirp response. Note that wb_readdirp_cbk already checks for presence of a non-empty liability queue and invalidates iatt. Since the only way a liability queue can shrink is by fulfilling requests in liability queue, wb_fulfill_cbk indicates to wb_readdirp_cbk that a potential race could've happened b/w readdirp and fulfill.
| Change-Id: I12d167bf450648baa64be1cbe1ca0fddf5379521 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> updates: bz#1512691
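The marking scheme described in this commit can be modelled in a few lines. This is a minimal Python sketch, not the translator's C code: the `Inode` class, the `invalidate` field and the function names are hypothetical, and the point at which the mark is cleared (when the last readdirp session ends) is an assumption, since the commit message does not say.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    """Hypothetical stand-in for write-behind's per-inode state."""
    liability: list = field(default_factory=list)  # cached writes not yet synced
    readdirps: int = 0        # readdirp sessions in progress (used on the parent)
    invalidate: bool = False  # set when a fulfill raced with a readdirp


def readdirp_start(parent: Inode) -> None:
    parent.readdirps += 1


def fulfill_done(parent: Inode, child: Inode, request: str) -> None:
    """A cached write was synced, so the liability queue shrinks."""
    child.liability.remove(request)
    if parent.readdirps > 0:
        # A readdirp is in flight: its stat for this child may predate the sync.
        child.invalidate = True


def readdirp_done(parent: Inode, entries: dict) -> dict:
    """Map name -> stat, with the stat dropped (None) where it may be stale."""
    parent.readdirps -= 1
    result = {}
    for name, (child, stat) in entries.items():
        stale = bool(child.liability) or child.invalidate
        result[name] = None if stale else stat
        if parent.readdirps == 0:
            child.invalidate = False  # assumption: clear once no session remains
    return result
```

With this model, steps 1 to 4 of the race end with the stat withheld, even though the liability queue is already empty when the readdirp response is processed.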
* Revert "performance/write-behind: better invalidation in readdirp" | Raghavendra G | 2018-08-21 | 1 | -28/+23
| | | | | | | | | | | This reverts commit 4d3c62e71f3250f10aa0344085a5ec2d45458d5c. Traversing all children of a directory in wb_readdirp caused significant performance regression. Hence reverting this patch Change-Id: I6c3b6cee2dd2aca41d49fe55ecdc6262e7cc5f34 updates: bz#1512691 Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
* write-behind: coverity fixes | Bhumika Goyal | 2018-08-20 | 1 | -3/+7
| | | | | | | | Fixes CID: 1124360 1291740 1370918 Change-Id: I008c7ade8f9809d040f42f6d3e9af70fff2f3dc6 updates: bz#789278 Signed-off-by: Bhumika Goyal <bgoyal@redhat.com>
* performance/readdir-ahead: keep stats of cached dentries in sync with modifications | Krutika Dhananjay | 2018-08-18 | 3 | -20/+606
| PROBLEM: Stats of dentries that are readdirp'd ahead can become stale due to fops like writes, truncate etc that modify the file pointed by dentries. When a readdir is finally wound at offset corresponding to these entries, the iatts that are returned to the application come from readdir-ahead's cache, which are stale by now. This problem gets further aggravated when caching translators/modules cache and continue to serve this stale information.
| FIX:
| * Store the iatt in context of the inode pointed by dentry.
| * Whenever the inode pointed by dentry undergoes modification, in cbk of modification fop, update the iatt stored in inode-ctx to reflect the modification.
| * When serving a readdirp response to the application, update iatts of dentries with the iatts stored in the context of inodes pointed by these dentries.
| * Some fops don't have valid iatts in their responses. For eg., write response whose data is still cached in write-behind will have zeroed out stat. In this case keep only ia_type and ia_gfid and reset rest of the iatt members to zero.
|   - fuse-bridge in this case just sends "entry" information back to kernel and attr is not sent.
|   - gfapi sets entry->inode to NULL and zeroes out the entire stat
| * There is one tiny race between the entry creation and a readdirp on its parent dir, which could cause the inode-ctx setting and inode ctx reading to happen on two different inode objects. To prevent this, when entry->inode is not equal to linked_inode,
|   - fuse-bridge is made to send only "entry" information without attributes
|   - gfapi sets entry->inode to NULL and zeroes out the entire stat.
| Change-Id: Ia27ff49a61922e88c73a1547ad8aacc9968a69df BUG: 1390050 Updates: bz#1390050 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
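The first four points of the fix can be sketched as follows. This is a Python model under stated assumptions, not the readdir-ahead code: the inode context is represented as a plain dict keyed by gfid, a zeroed-out poststat is represented by `None`, and `update_ctx` and `serve_readdirp` are hypothetical names. Only the `ia_type` and `ia_gfid` member names come from the commit message.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Iatt:
    ia_gfid: str
    ia_type: str
    ia_size: int = 0
    ia_mtime: int = 0


def update_ctx(ctx: Dict[str, Iatt], gfid: str, poststat: Optional[Iatt]) -> None:
    """Called from the cbk of a modifying fop (write, truncate, ...)."""
    if poststat is not None:
        ctx[gfid] = poststat  # valid poststat: remember the new iatt
    elif gfid in ctx:
        old = ctx[gfid]
        # Zeroed-out stat: keep only ia_gfid and ia_type, reset everything else.
        ctx[gfid] = Iatt(ia_gfid=old.ia_gfid, ia_type=old.ia_type)


def serve_readdirp(cached: List[Tuple[str, Iatt]],
                   ctx: Dict[str, Iatt]) -> List[Tuple[str, Iatt]]:
    """Refresh each prefetched dentry's iatt from the inode context."""
    return [(name, ctx.get(iatt.ia_gfid, iatt)) for name, iatt in cached]
```

A dentry prefetched with size 3 is served with the updated size after a write with a valid poststat, and with everything except type and gfid zeroed after a write whose stat was zeroed out.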
* performance/md-cache: Use bitwise AND instead of logical AND | Vijay Bellur | 2018-08-16 | 1 | -1/+1
| | | | | | | | Addresses CID: 1394640 Change-Id: I1139222301569d17760df74624acd301594063b9 updates: bz#789278 Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* performance/quick-read: handle rollover of generation counter | Raghavendra G | 2018-08-13 | 2 | -36/+108
| | | | | | Change-Id: I37a6e0efda430b70d03dd431c35bef23b3d16361 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Updates: bz#1512691
* performance/quick-read: don't update with stale data after invalidation | Raghavendra G | 2018-08-04 | 2 | -44/+233
| | | | | | | | | | | | Once invalidated, make sure that only ops incident after invalidation update the cache. This makes sure that ops before invalidation don't repopulate cache with stale data. This patch also uses an internal counter instead of frame->root->unique for keeping track of generations. Change-Id: I6b38b141985283bd54b287775f3ec67b88bf6cb8 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Updates: bz#1512691
* Revert "performance/readdir-ahead: Invalidate cached dentries if they're modified while in cache" | Raghavendra G | 2018-08-03 | 3 | -597/+19
| This reverts commit 7131de81f72dda0ef685ed60d0887c6e14289b8c.
| With the latest master, I created a single brick volume and some files inside it.
| [root@rhgs313-6 ~]# umount -f /mnt/fuse1; mount -t glusterfs -s 192.168.122.6:/thunder /mnt/fuse1; ls -l /mnt/fuse1/; echo "Trying again"; ls -l /mnt/fuse1
| umount: /mnt/fuse1: not mounted
| total 0
| ----------. 0 root root 0 Jan 1 1970 file-1
| ----------. 0 root root 0 Jan 1 1970 file-2
| ----------. 0 root root 0 Jan 1 1970 file-3
| ----------. 0 root root 0 Jan 1 1970 file-4
| ----------. 0 root root 0 Jan 1 1970 file-5
| d---------. 0 root root 0 Jan 1 1970 subdir
| Trying again
| total 3
| -rw-r--r--. 1 root root 33 Aug 3 14:06 file-1
| -rw-r--r--. 1 root root 33 Aug 3 14:06 file-2
| -rw-r--r--. 1 root root 33 Aug 3 14:06 file-3
| -rw-r--r--. 1 root root 33 Aug 3 14:06 file-4
| -rw-r--r--. 1 root root 33 Aug 3 14:06 file-5
| d---------. 0 root root 0 Jan 1 1970 subdir
| [root@rhgs313-6 ~]#
| Conversation can be followed on gluster-devel on thread with subj: tests/bugs/distribute/bug-1122443.t - spurious failure. git-bisected pointed this patch as culprit.
| Change-Id: I1eb46f6c196f44fde8ce991840a0e724e6f50862 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Updates: bz#1390050
* performance/ob: stringent synchronization between rename/unlink and open | Raghavendra G | 2018-08-03 | 2 | -67/+330
| Issue 1:
| ========
| open all pending fds before resuming rename and unlink
| currently ob uses fd_lookup to find out the opened-behind. But, fd_lookup gives the recent fd opened on the inode, but the oldest fd(s) (there can be multiple fds opened-behind when the very first opens on an inode are issued in parallel) are the candidates for fds with pending opens on backend. So, this patch explicitly tracks the opened-behind fds on an inode and opens them before resuming rename or unlink.
| similar code changes are also done for setattr and setxattr to make sure pending opens are complete before permission change.
| This patch also adds a check for an open-in-progress to ob_get_wind_fd. If there is already an open-in-progress, ob_get_wind_fd won't return an anonymous fd as a result. This is done to make sure that rename/unlink/setattr/setxattr don't race with an operation like readv/fstat on an anonymous fd already in progress.
| Issue 2:
| ========
| once renamed/unlinked, don't open-behind any future opens on the same inode.
| Issue 3:
| ========
| Don't use anonymous fds by default. Note that rename/unlink can race with a read/fd on anonymous fds and these operations can fail with ESTALE. So, for better consistency in default mode, don't use anonymous fds. If performance is needed with tradeoff of consistency, one can switch on the option "use-anonymous-fd"
| Change-Id: Iaf130db71ce61ac37269f422e348a45f6ae6e82c Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Updates: bz#1512691
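Issues 1 and 2 amount to a small amount of per-inode bookkeeping. The sketch below is a Python model, assuming a per-inode list of opened-behind fds and an "unlinked" flag; `Fd`, `ObInode`, `ob_open` and `ob_unlink` are hypothetical names, and `backend_log` merely records the order in which operations would reach the brick.

```python
class Fd:
    def __init__(self, name: str):
        self.name = name
        self.opened = False  # True once the open has reached the backend


class ObInode:
    def __init__(self):
        self.pending = []      # fds opened-behind, oldest first
        self.unlinked = False  # set once the inode was renamed/unlinked


def ob_open(inode: ObInode, name: str, backend_log: list) -> Fd:
    fd = Fd(name)
    if inode.unlinked:
        # Issue 2: after rename/unlink, opens are no longer deferred.
        backend_log.append(("open", name))
        fd.opened = True
    else:
        inode.pending.append(fd)  # defer: open-behind
    return fd


def ob_unlink(inode: ObInode, backend_log: list) -> None:
    # Issue 1: complete every pending open before the unlink is resumed.
    for fd in inode.pending:
        backend_log.append(("open", fd.name))
        fd.opened = True
    inode.pending.clear()
    inode.unlinked = True
    backend_log.append(("unlink", None))
```

Tracking every opened-behind fd, rather than asking for the most recent fd on the inode, is what guarantees that the oldest deferred opens are not missed.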
* performance/open-behind: don't use anonymous fds for reads by default | Raghavendra G | 2018-08-02 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | anonymous fds interfere with working of read-ahead as read-ahead won't be able to store its cache in fd. Also, as seen in bz 1455872, anonymous fds also affect performance of large file sequential reads as the cost of opening fd for each read on brick stack is significant. So, have a proper fd which enables read-ahead to store its cache and brick stack to reuse the fd during reads. With this change test tests/bugs/snapshot/bug-1167580-set-proper-uid-and-gid-during-nfs-access.t fails consistently. The failure can also be seen with open-behind off. bz 1611532 has been filed to track the issue with test. Thanks to Rafi <rkavunga@redhat.com> for assistance provided in debugging test failure. Change-Id: Ifa52d8ff017f115e83247f3396b9d27f0295ce3f Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Fixes: bz#1455872
* performance/md-cache: update cache only from fops issued after previous invalidation | Raghavendra G | 2018-08-02 | 1 | -98/+238
| Invalidations are triggered mainly by two codepaths - upcall and write-behind unwinding a cached write with zeroed out stat.
| For the case of upcall, following race can happen:
| * stat s1 is fetched from brick
| * invalidation is detected on brick
| * invalidation is propagated to md-cache and cache is invalidated
| * s1 updates md-cache with a stale state
| For the case of write-behind, imagine following sequence of operations,
| * A stat s1 was issued from application thread t1 when size of file was s1
| * stat s1 completes on brick stack, but yet to reach md-cache
| * A write w1 from application thread t2 extends file to size s2 is cached in write-behind and response is unwound with zeroed out stat
| * md-cache while handling write-cbk, invalidates cache
| * md-cache receives response for s1, updates cache with stale stat with size of s1 overwriting invalidation state
| Fix is to remember when s1 was incident on md-cache and update cache with results of s1 only if it was incident after invalidation of cache.
| This patch identified some bugs in regression tests which is tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1608158. As a stop gap measure I am marking following tests as bad: basic/afr/split-brain-resolution.t bugs/bug-1368312.t bugs/replicate/bug-1238398-split-brain-resolution.t bugs/replicate/bug-1417522-block-split-brain-resolution.t bugs/replicate/bug-1438255-do-not-mark-self-accusing-xattrs.t
| Change-Id: Ia4bb9dd36494944e2d91e9e71a79b5a3974a8c77 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Updates: bz#1512691
* performance/write-behind: synchronize rename with cached writes on src | Raghavendra G | 2018-08-02 | 1 | -0/+40
| | | | | | | | | | | rename response contains a postbuf stat of src-inode. Since md-cache caches stat in rename codepath too, we've to make sure stat accounts any cached writes in write-behind. So, we make sure rename is resumed only after any cached writes are committed to backend. Change-Id: Ic9f2adf8edd0b58ebaf661f3a8d0ca086bc63111 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Updates: bz#1512691
* performance/write-behind: better invalidation in readdirp | Raghavendra G | 2018-07-28 | 1 | -23/+28
| | | | | | | | | | | | | | | | | | | | | | | | Current invalidation of stats in wb_readdirp_cbk is prone to races. As the deleted comment explains, <snip> We cannot guarantee integrity of entry->d_stat as there are cached writes. The stat is most likely stale as it doesn't account the cached writes. However, checking for non-empty liability list here is not a fool-proof solution as there can be races like, 1. readdirp is successful on posix 2. sync of cached write is successful on posix 3. write-behind received sync response and removed the request from liability queue 4. readdirp response is processed at write-behind. In the above scenario, stat for the file is sent back in readdirp response but it is stale. </snip> Change-Id: I6ce170985cc6ce3df2382ec038dd5415beefded5 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Updates: bz#1512691
* performance/readdir-ahead: Invalidate cached dentries if they're modified while in cache | Krutika Dhananjay | 2018-07-28 | 3 | -19/+597
| PROBLEM: Entries that are readdirp'd ahead can undergo modification in terms of writes, truncates which could modify their iatts. When a readdir is finally wound at offset corresponding to these entries, the iatts that are returned to the application come from readdir-ahead's cache, which are stale by now. This problem gets further aggravated when caching translators/modules cache and continue to serve this stale information.
| FIX: Whenever a dentry undergoes modification, in the cbk of the modification fop, a "dirty" flag (default 0) is set in its inode ctx. When it's time for readdir-ahead to serve these entries, it will read the inode ctx and check if the entry is "dirty", and if it is, set the entry's attrs to all zeroes, as an indicator to fuse, md-cache etc not to cache these attributes.
| Also there is one tiny race between the entry creation and a readdirp on its parent dir, which could cause the inode-ctx setting and inode ctx reading to happen on two different inode objects. To prevent this, fuse-bridge is made to drop entries for which dentry->inode is not the same as linked inode, in readdirp cbk.
| Change-Id: If7396507632b5268442ca580473d5155fee9cbef BUG: 1390050 Updates: bz#1390050 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
* core (named threads): flood of -Wformat-truncation warnings with gcc-7.1 | Kaleb S. KEITHLEY | 2018-07-23 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Starting in Fedora 26 which has gcc-7.1.x, -Wformat-truncation is enabled with -Wformat, resulting in a flood of new warnings. This many warnings is a concern because it makes it hard(er) to see other warnings that should be addressed. An example is at https://kojipkgs.fedoraproject.org//packages/glusterfs/3.12.0/1.fc28/data/logs/x86_64/build.log For more info see https://review.gluster.org/#/c/18267/ I can't find much (or good) documentation on the heuristics the compiler uses for this warning. In the case of printing integer types it appears it looks at the available space in the destination and the range of values for the variable and/or its type. To address the specific question about why 0x3ff versus 0xfff to mask the value, either would suffice to hint to the compiler that the printed value will fit in three characters. But the loop is from 0...1023 (or 0...0x3ff if you prefer) so I chose that as a more "accurate" mask to use as it exactly matches the range of values of the loop. Fixes: bz#1492847 Change-Id: I6e309ba42159841131d8241bfc0566ef09e00aa9
* All: run codespell on the code and fix issues. | Yaniv Kaul | 2018-07-22 | 4 | -10/+10
| | | | | | | | | | | | Please review, it's not always just the comments that were fixed. I've had to revert of course all calls to creat() that were changed to create() ... Only compile-tested! Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* performance/read-ahead: stricter adherence to force-atime-update | Raghavendra G | 2018-07-19 | 1 | -12/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | Throw away read-ahead cache in fstat only if force-atime-update is set. Note that fstat flushes read-ahead cache only for atime consistency. However if atime consistency is needed user is required to set force-atime-update which updates atime on backend fs even though application reads are served from read-ahead cache. So, if user has not set force-atime-update, atime won't be accurate and there is no point in flushing read-ahead cache in fstats. Mounts requiring atime consistency have to mandatorily set force-atime-update. Also note that normally the kernel intersperses reads with fstat. So, read-ahead is not effective as fstats flush read-ahead-cache. Instead it regresses performance due to wasted network reads. It is recommended to turn off read-ahead if applications require atime consistency. This patch is aimed at applications which don't require atime consistency. Without atime consistency required, read-ahead cache is effective and increases performance of sequential reads. Change-Id: I122bbc410cee96661823f9c4b934383495c18446 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Fixes: bz#1601166
* md-cache: Do not invalidate cache post set/remove xattr | Poornima G | 2018-07-11 | 1 | -4/+52
| | | | | | | | | | | | | | | Since setxattr and removexattr fops cbk do not carry poststat, the stat cache was being invalidated in setxattr/removexattr cbk. Hence the further lookup wouldn't be served from cache. To prevent this invalidation, md-cache is modified to get the poststat in set/removexattr_cbk in dict. Co-authored with Xavi Hernandez. Change-Id: I6b946be2d20b807e2578825743c25ba5927a60b4 fixes: bz#1586018 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> Signed-off-by: Poornima G <pgurusid@redhat.com>
* performance/md-cache: Fix issue on lock being used before init. | Zhang Huan | 2018-06-27 | 1 | -1/+2
| | | | | | | | | The lock is used in mdc_xattr_list_populate(), but was initialized only after that call. Fix this by moving the lock initialization ahead of the call. Change-Id: I94b08303a8ba74b1e9388f700587a00b7ae3fd78 fixes: bz#1595174 Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
* performance/quick-read: provide an invalidation based on ctime | Raghavendra G | 2018-06-18 | 2 | -1/+49
| | | | | | | | | | | | | | | | | | | | Quick-read by default uses mtime to identify changes to file data. However there are applications like rsync which explicitly set mtime making it unreliable for the purpose of identifying change in file content. Since ctime also changes when content of a file changes and it cannot be set explicitly, it becomes suitable for identifying staleness of cached data. This option makes quick-read to prefer ctime over mtime to validate its cache. However, using ctime can result in false positives as ctime changes with just attribute changes like permission without changes to file data. So, use this option only when mtime is not reliable. credits to Kotresh Hiremath Ravishankar <khiremat@redhat.com> for suggestion on using ctime instead of mtime. Change-Id: Ib3ae39a3252b2876c8ffe81f471d02a87190e9b9 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Updates: bz#1591621
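The trade-off between the two timestamps can be illustrated with a small Python sketch. It assumes the cache is validated by comparing the cached timestamp with a freshly fetched one; `Stat`, `cache_is_stale` and the `use_ctime` flag are hypothetical names, since the commit message does not give the name of the option.

```python
from dataclasses import dataclass


@dataclass
class Stat:
    mtime: int
    ctime: int


def cache_is_stale(cached: Stat, fresh: Stat, use_ctime: bool) -> bool:
    """Decide whether cached file content must be dropped."""
    if use_ctime:
        return fresh.ctime != cached.ctime
    return fresh.mtime != cached.mtime


cached = Stat(mtime=100, ctime=100)

# rsync-like case: content rewritten, then mtime explicitly set back to 100.
after_rsync = Stat(mtime=100, ctime=205)

# chmod-like case: only attributes changed, content untouched.
after_chmod = Stat(mtime=100, ctime=300)
```

For `after_rsync`, the mtime check misses the change while the ctime check catches it. For `after_chmod`, the ctime check reports staleness although the data is unchanged, which is the false positive the commit message warns about.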
* performance/io-cache: fix a missing unlock | Vijay Bellur | 2018-05-31 | 1 | -1/+1
| | | | | | | | Fixes: bz789278 Change-Id: If8ca1fef8a10f1e7270390b61121f8a20a76b1d0 updates: bz#789278 Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* performance/open-behind: open pending fds before permission change | Raghavendra G | 2018-05-29 | 1 | -1/+60
| | | | | | | | | | | setattr, posix-acl and selinux changes on a file can revoke permission to open the file after permission changes. To prevent that, make sure the pending fd is opened before winding down setattr or setxattr (for posix-acl and selinux) calls. Change-Id: Ib0b91795d286072e445190f9a1b3b1e9cd363282 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> fixes: bz#1405147
* performance/read-ahead: throwaway read-ahead cache of all fds on writes on any fd | Raghavendra G | 2018-05-29 | 1 | -28/+32
| This is to make sure applications that read and write on different fds of the same file work.
| This patch also fixes two other issues:
| 1. while iterating over the list of open fds on an inode, initialize tmp_file to 0 for each iteration before fd_ctx_get to make sure we don't carry over the history from previous iterations.
| 2. remove flushing of cache in flush and fsync as by themselves, they don't modify the data
| Change-Id: Ib9959eb73702a3ebbf90badccaa16b2608050eff Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Updates: bz#1512691
* Revert "performance/write-behind: fix flush stuck by former failed writes" | Raghavendra G | 2018-05-29 | 1 | -7/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 9340b3c7a6c8556d6f1d4046de0dbd1946a64963. operations/writes across different fds of the same file cannot be considered as independent. For eg., man 2 fsync states, <man 2 fsync> fsync() transfers ("flushes") all modified in-core data of (i.e., modified buffer cache pages for) the file referred to by the file descriptor fd to the disk device </man> This means fsync is an operation on file and fd is just a way to reach file. So, it has to sync writes done on other fds too. Patch 9340b3c7a6c, prevents this. The problem fixed by patch 9340b3c7a6c - a flush on an fd is hung on a failed write (held in cache for retrying) on a different fd - is solved in this patch by making sure __wb_request_waiting_on considers failed writes on any fd as dependent on flush/fsync on any fd (not just the fd on which writes happened) opened on the same file. This means failed writes on any fd are either synced or thrown away on witnessing flush/fsync on any fd of the same file. Change-Id: Iee748cebb6d2a5b32f9328aff2b5b7cbf6c52c05 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Updates: bz#1512691
* performance/quick-read: Use generation numbers to avoid updating the cache with stale data | Raghavendra G | 2018-05-28 | 2 | -27/+51
| Thanks to Pranith for the example. Following is the race we are trying to solve with this patch.
| 1) We have a file with content 'abc'
| 2) A lookup and a writev which replaces 'abc' with 'def' arrive. Lookup fetches 'abc' but is yet to update the cache, and then immediately writev is wound, which zeros out the cache. Now lookup_cbk updates the buffer with 'abc' even though on disk it is 'def'. Now writev completes and returns to application.
| 3) application does a readv which will be fetched from quick-read as 'abc'.
| Change-Id: I9a9cab9c99652aa6d17230a4fe4dc034ec502b1b BUG: 1390050 Updates: bz#1390050 Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
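The commit message does not spell out how the generation numbers are used, so the following is only a minimal Python sketch of one common way to apply the idea: every invalidation bumps a counter, each lookup remembers the counter value at the time it was issued, and its callback may populate the cache only if no invalidation happened in between. `QuickReadCache` and its method names are hypothetical.

```python
class QuickReadCache:
    def __init__(self):
        self.gen = 0      # bumped on every invalidation
        self.data = None  # cached file content, if any

    def wind_lookup(self) -> int:
        """Remember the generation at which the lookup was issued."""
        return self.gen

    def invalidate(self) -> None:
        """Called when a write is wound: drop the cache and move on one generation."""
        self.gen += 1
        self.data = None

    def lookup_cbk(self, issued_gen: int, content: bytes) -> bool:
        """Populate the cache only if nothing invalidated it since the lookup was issued."""
        if issued_gen != self.gen:
            return False  # stale reply: an invalidation happened in between
        self.data = content
        return True
```

Replaying the race from the message: the lookup is issued, the writev invalidates the cache, and the late `lookup_cbk` carrying 'abc' is rejected, so a later readv cannot be served stale data from the cache.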
* build: Disallow unresolved symbol references | Prashanth Pai | 2018-05-18 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | | | | | In the past, it was often[1] forgotten for xlators to be linked against the symbols they refer to. This often caused glusterd2 to fail while loading xlator's shared object (.so) file. This change adds "--no-undefined" as a linker flag which causes the linker to treat unresolved symbol references as an error and hence fail linking. [1]: https://review.gluster.org/#/c/19912/ https://review.gluster.org/#/c/19664/ https://review.gluster.org/#/c/19056/ https://review.gluster.org/#/c/17659/ https://bugzilla.redhat.com/show_bug.cgi?id=1532238 Bonus: Added cloudsync and utime xlator's generated source files to .gitignore Updates: bz#1193929 Change-Id: I9604a4a87b7313a5fa43bda5fdb37dfa7ef8facd Signed-off-by: Prashanth Pai <ppai@redhat.com>
* readdir-ahead: Fix an issue with parallel-readdir and readdir-optimize | Poornima G | 2018-05-17 | 1 | -1/+1
| | | | | | | | | | | | | | | | Issue: When parallel-readdir is enabled, readdir-optimize automatically stops working because of a bug in rda_opendir. RCA: In rda_opendir, the xattr that indicates whether readdir-optimize is enabled is sent in xdata. This xdata is sent to all the readdirp prefetch calls. A dict_ref is taken on xdata and kept in rda_opendir to be used by rda_fill_fd, but dht_opendir deletes some elements in xdata after calling rda_opendir. Hence dict_ref is not the right choice here; dict_copy needs to be used. Change-Id: Ie7cc7ceb03117dd4179ef7905647f2f123f94966 fixes: bz#1578650 Signed-off-by: Poornima G <pgurusid@redhat.com>
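The difference between keeping a reference and keeping a copy is the same aliasing problem in any language. A small Python analogy, in which plain dicts stand in for `dict_t` and the key name "readdir-optimize" is only illustrative (the real xattr key is not given in the commit message):

```python
import copy


def opendir_keep_ref(xdata: dict) -> dict:
    """Analogous to dict_ref: the same object is shared with the caller."""
    return xdata


def opendir_keep_copy(xdata: dict) -> dict:
    """Analogous to dict_copy: an independent snapshot is kept."""
    return copy.copy(xdata)


xdata = {"readdir-optimize": 1, "other-key": 2}
kept_ref = opendir_keep_ref(xdata)
kept_copy = opendir_keep_copy(xdata)

# The caller (dht_opendir in the commit message) later deletes elements.
del xdata["readdir-optimize"]
```

After the deletion, `kept_ref` no longer carries the key while `kept_copy` still does, which is why later prefetch calls need the copy.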
* performance/md-cache: purge cache on ENOENT/ESTALE errors | Raghavendra G | 2018-04-25 | 1 | -87/+438
| | | | | | | | | | If not, next lookup could be served from cache and can be success, which is wrong. This can affect retry logic of VFS when it receives an ESTALE. Change-Id: Iad8e564d666aa4172823343f19a60c11e4416ef6 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Fixes: bz#1566303
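A minimal Python sketch of the purge rule, assuming a stat cache keyed by gfid; `MdCache` and `fop_cbk` are hypothetical names and the real translator tracks considerably more state.

```python
import errno


class MdCache:
    def __init__(self):
        self.stats = {}  # gfid -> cached stat

    def fop_cbk(self, gfid: str, op_errno: int, stat=None) -> None:
        """Handle a reply from the brick for any fop on this inode."""
        if op_errno in (errno.ENOENT, errno.ESTALE):
            # The file is gone or the handle is stale: forget everything,
            # so the next lookup goes to the brick instead of succeeding
            # from the cache.
            self.stats.pop(gfid, None)
        elif op_errno == 0 and stat is not None:
            self.stats[gfid] = stat
```

Without the purge, a lookup retried by the VFS after ESTALE could be answered from the cache and succeed for a file that no longer exists.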
* gluster: Sometimes Brick process is crashed at the time of stopping brick | Mohit Agrawal | 2018-04-19 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Problem: The brick process sometimes crashes when a brick is stopped while brick mux is enabled. Solution: The crash happened because rpc connections were not cleaned up properly while brick mux is enabled. With this patch, after sending the GF_EVENT_CLEANUP notification to the server xlator, the process waits for all rpc client connections of that xlator to be destroyed. Once the rpc connections of all clients associated with that brick are destroyed in server_rpc_notify, xlator_mem_cleanup is called for the brick xlator as well as all its child xlators. To avoid races at the time of cleanup, two new flags are introduced in each xlator: cleanup_starting and call_cleanup. BUG: 1544090 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Note: Ran all test cases in a separate build (https://review.gluster.org/#/c/19700/) with the same patch after forcefully enabling brick mux; all test cases passed. Change-Id: Ic4ab9c128df282d146cf1135640281fcb31997bf updates: bz#1544090
* xlators/performance: Add pass-through option | Varsha Rao | 2018-04-11 | 8 | -9/+103
| | | | | | | | | | Add pass-through option in performance translators. Set the option in GF_OPTION_INIT() and GF_OPTION_RECONF() Updates: #304 Change-Id: If1537450147d154905831e36f7162a32866d7ad6 Signed-off-by: Varsha Rao <varao@redhat.com>
* quick-read: Provide statistics to the monitor | Poornima G | 2018-03-28 | 2 | -26/+89
| | | | | | | Updates: #425 Change-Id: Iea5198821f4eabc46bc63529afa4a92d4b4c2be0 Signed-off-by: Poornima G <pgurusid@redhat.com>
* nl-cache: Provide statistics to the monitor | Poornima G | 2018-03-24 | 1 | -9/+61
| | | | | | | Updates: #429 Change-Id: Ic2e64422055f1838d5d453643c739ef1e9319cfe Signed-off-by: Poornima G <pgurusid@redhat.com>
* md-cache: Provide statistics to the monitor | Poornima G | 2018-03-24 | 1 | -9/+57
| | | | | | | Updates: #427 Change-Id: Ib1f45016ac75d7bc2755db0dd4b68ce1d95d26c3 Signed-off-by: Poornima G <pgurusid@redhat.com>
* nl-cache: Fix coverity issue RESOURCE_LEAK | Poornima G | 2018-03-06 | 1 | -0/+3
| | | | | Change-Id: Ic552f31853e1886b8c76d45c8c66251f1fd6f97f Signed-off-by: Poornima G <pgurusid@redhat.com>
* nl-cache: Fix coverity issue RETURN_LOCAL | Poornima G | 2018-03-06 | 1 | -1/+1
| | | | | Change-Id: Ic6fbd34aad2a5ae5e27d833300bcd1284cb98c24 Signed-off-by: Poornima G <pgurusid@redhat.com>
* quick-read: Fix coverity issue CHECKED_RETURN | Poornima G | 2018-03-05 | 1 | -2/+3
| | | | | Change-Id: I989e8fe28c86f67b7e54692c01ae3ed6e729aa16 Signed-off-by: Poornima G <pgurusid@redhat.com>
* io-cache: Fix coverity issue NEGATIVE_RETURNS | Poornima G | 2018-03-05 | 1 | -1/+1
| | | | | Change-Id: I811225ad20e3bd9f05820212e6a843f05d96b246 Signed-off-by: Poornima G <pgurusid@redhat.com>
* md-cache: Fix coverity issue FORWARD_NULL | Poornima G | 2018-03-02 | 1 | -3/+4
| | | | | Change-Id: I6ace846c412d898c0bc024b5d2081b11a223372f Signed-off-by: Poornima G <pgurusid@redhat.com>
* perfomance/io-threads: Add option to disable client disconnect feature | Varsha Rao | 2018-02-28 | 2 | -1/+24
| | | | | | | | | | | | | > Add options to disable new features > Commit ID: c071992e8d > https://review.gluster.org/#/c/18291/ > By Michael Goulet <mgoulet@fb.com> This patch is required to forward port io-threads namespace patch. Updates: #401 Change-Id: Ice477fdf4b8934f9fac0b4a2f6c93db97429a586 Signed-off-by: Varsha Rao <varao@redhat.com>
* io-cache: Fix coverity issue | Poornima G | 2018-02-27 | 1 | -4/+2
| | | | | | | | | Coverity issue : FORWARD_NULL fd is assigned within a condition, but the fd is used even outside the condition. Change-Id: I6548d605d8a8acc6a25f1657f9fb75586d513042 Signed-off-by: Poornima G <pgurusid@redhat.com>
* glusterfsd: Memleak in glusterfsd process while brick mux is on | Mohit Agrawal | 2018-02-27 | 2 | -3/+10
| | | | | | | | | | | | | | | | | | Problem: When a volume is stopped while brick multiplex is enabled, memory is not cleaned up in all server side xlators. Solution: To clean up memory for all server side xlators, call fini in glusterfs_handle_terminate after sending the GF_EVENT_CLEANUP notification to the top xlator. BUG: 1544090 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Note: Ran all test cases in a separate build (https://review.gluster.org/19574) with the same patch after forcefully enabling brick mux; all test cases passed. Change-Id: Ia10dc7f2605aa50f2b90b3fe4eb380ba9299e2fc
* performance/io-threads: nuke everything from a client when it disconnects | Varsha Rao | 2018-02-27 | 1 | -2/+37
| | | | | | | | | | | | | > io-threads: nuke everything from a client when it disconnects > Commit ID: 4d8268d760 > https://review.gluster.org/#/c/18254/ > By Jeff Darcy <jdarcy@fb.com> This patch is required to forward port io-threads namespace patch. Updates: #401 Change-Id: I13d3a74862eea3d01e8dbc8736987c3dae6e8b2a Signed-off-by: Varsha Rao <varao@redhat.com>
* write-behind: Make aggregate size configurable | Poornima G | 2018-02-26 | 1 | -5/+21
| | | | | | | | | | | | | | | | | | | | | | | Currently the aggregate size is by default 128K (page size). From a performance perspective, a small number of large writes is faster than a large number of small writes, especially in EC volumes. But identifying the right aggregate size depends on multiple factors like the memcpy overhead, network overhead etc. On a local machine, combining 128k writes into 1M writes for EC volumes yielded a 30% improvement. As a part of this patch, aggregate size is just made configurable and page_size is modified accordingly. Raghavendra Gowdappa had suggested that, while aggregating writes, we should get rid of the memcpy of large write sizes and instead add the pointer to the existing vector; this will be done as a part of another patch. Also, in EC volumes, the vectors are merged into one vector, so even if we save the memcpy in write-behind, EC would anyway do a memcpy for merging vectors into one vector. Updates: #364 Change-Id: Ib67294b8577bea14dde1c84cd271012ecea99f09 Signed-off-by: Poornima G <pgurusid@redhat.com>
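What a configurable aggregate size means for the number of writes sent downstream can be sketched as below. This is a Python model under the assumption that only contiguous writes are merged and that a merged write never exceeds the aggregate size; `aggregate` is a hypothetical helper, not write-behind's actual routine.

```python
from typing import List, Tuple

Write = Tuple[int, bytes]  # (offset, data)


def aggregate(writes: List[Write], aggregate_size: int) -> List[Write]:
    """Merge contiguous writes into chunks of at most aggregate_size bytes."""
    out: List[Write] = []
    for offset, data in writes:
        if out:
            last_off, last_data = out[-1]
            contiguous = last_off + len(last_data) == offset
            if contiguous and len(last_data) + len(data) <= aggregate_size:
                # Concatenation copies the bytes, much like the memcpy
                # overhead mentioned in the commit message.
                out[-1] = (last_off, last_data + data)
                continue
        out.append((offset, data))
    return out


KB = 1024
small_writes = [(i * 128 * KB, b"x" * (128 * KB)) for i in range(8)]
merged_1m = aggregate(small_writes, 1024 * KB)
merged_128k = aggregate(small_writes, 128 * KB)
```

With a 1M aggregate size the eight 128K writes collapse into a single write, and with 128K they stay as eight. The copy performed on each merge is the cost the commit message proposes to remove in a later patch.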
* md-cache: Modify options to be gd2 compatible | Poornima G | 2018-02-26 | 1 | -2/+28
| | | | | Change-Id: I79d51fee8ec5d2d237de7dd21c2d28c18cfd7ce8 Signed-off-by: Poornima G <pgurusid@redhat.com>
* nl-cache: Change the options to be gd2 compatible | Poornima G | 2018-02-26 | 1 | -0/+6
| | | | | Change-Id: Ib9d233df41b85c845643e3e6eb2d680e01859a43 Signed-off-by: Poornima G <pgurusid@redhat.com>