summaryrefslogtreecommitdiffstats
path: root/xlators/performance
Commit message (Collapse)AuthorAgeFilesLines
* io-threads: nuke everything from a client when it disconnectsJeff Darcy2017-09-091-3/+38
| | | | | | | | | | | | Summary: These requests haven't been issued, yet alone acknowledged. They would disappear if we crashed, which to the client is indistinguishable from any other kind of disconnection - if indeed the client itself isn't the one that died. So we're completely within our rights to discard these. There are strong hints that such "orphan" requests are part of how we get into the lock-revocation hangs we've been seeing for a while. Even if that theory doesn't pan out, there's no good reason to keep them around clogging up queues and so forth. This is a port of D5430057 & D5662545 to 3.8 Change-Id: Ie4c88f7791aac85540631f60f5c639497468ad76 Reviewed-on: https://review.gluster.org/18254 Reviewed-by: Shreyas Siravara <sshreyas@fb.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
* md-cache: Allow custom per-directory timeoutsShreyas Siravara2017-09-081-14/+62
| | | | | | | | | | | | | | | Summary: - This diff looks for a custom xattr on a directory or file called 'trusted.glusterfs.md-cache-timeout' and uses that timeout if it finds it instead of the default timeout value for the cache. - For example, if we know that a customer has a fixed set of directories that never change, we can set that attribute on all their directories and cache directory metadata for the lifetime of the client (NFS or FUSE) process. - Port of D5430395 to 3.8 Reviewed By: jdarcy Change-Id: Ieb232bc1365c59dd7c396c7a617f12973cc8ea01 Reviewed-on: https://review.gluster.org/18241 Reviewed-by: Shreyas Siravara <sshreyas@fb.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
* performance/io-threads: Add watchdog to cover up a possible thread leakJeff Darcy2017-09-082-16/+208
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: There appears to be a thread leak somewhere, which causes io-threads to run out of threads to process a particular (priority-based) queue. The leak should obviously be fixed, but that might take a while and the consequences until then are severe - a brick essentially going offline without the courtesy of actually dying. This patch adds a watchdog that checks for stuck queues, and adds threads to break the deadlock. The same thing done manually on one afflicted cluster caused brick CPU usage to drop from 2600% to 400%, with latency quickly returning to normal. The controlling option is performance.iot-watchdog-secs, which is the number of seconds we'll watch for a non-empty queue with no items being dequeued. That's our signal to add a thread. A value of zero (the default) disables this watchdog feature. This is a port of D5177414 to 3.8. Test Plan: All the usual tests to determine safety. Use gdb to hack priv->queue_sizes to a non-zero value. This will make it look like the queue is non-empty, but since it does in fact have zero items there will be no dequeues. After watchdog-secs seconds, this should add a thread, with a corresponding entry in the brick log. Change-Id: Ic051e411d3e9351e1cf5e233bad8bbb5078cb259 Reviewed-on: https://review.gluster.org/18239 Reviewed-by: Shreyas Siravara <sshreyas@fb.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
* [io-cache] New volume options for read sizesJoshua Eilers2017-09-082-5/+26
| | | | | | | | | | | | | | | | | | | Summary: Two new volume options that control reads. performance.io-cache.read-size - Tells gluster how much it should try to read on each posix_readv call performance.io-cache.min-cached-read-size - Tells gluster the smallest files it should start caching, anything smaller is not cached This is a port of D4844662 to 3.8 Change-Id: I5ba891906f97e514e7365cc34374619379434766 Reviewed-on: https://review.gluster.org/18235 Reviewed-by: Shreyas Siravara <sshreyas@fb.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
* md-cache: Invalidate inode metadata on flushShreyas Siravara2017-09-071-0/+35
| | | | | | | | | | | | | | | | | | Summary: - When you write a file and then stat it immediately, md-cache returns stale stat information. - This diff implements flush() in md-cache so that we can correctly invalidate inodes after a write. - This is a port of D4762171 to 3.8 Reviewers: kvigor, dph Reviewed By: kvigor Change-Id: I368b7870d61b14a7e390917d195cbccc67029eb7 Reviewed-on: https://review.gluster.org/18233 Reviewed-by: Shreyas Siravara <sshreyas@fb.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
* md-cache: Cache statfs callsShreyas Siravara2017-09-071-0/+126
| | | | | | | | | | | | | | | | | | | | | | | | Summary: - This gives md-cache to cache statfs calls - You can turn it on or off via 'gluster vol set groot performance.md-cache-statfs <on|off>' - This is a port of D4652632 Test Plan: Tested functionality on devserver Reviewers: kvigor Reviewed By: kvigor Subscribers: #posix_storage Differential Revision: https://phabricator.intern.facebook.com/D4652632 Change-Id: I664579e3c19fb9a6cd9d7b3a0eae061f70f4def4 Signature: t1:4652632:1488581841:111cc01efe83c71f1e98d075abb10589c4574705 Reviewed-on: https://review.gluster.org/18228 Reviewed-by: Shreyas Siravara <sshreyas@fb.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
* Merge remote-tracking branch 'origin/release-3.8' into release-3.8-fbJeff Darcy2017-08-311-0/+18
|\ | | | | | | Change-Id: Ie35cd1c8c7808949ddf79b3189f1f8bf0ff70ed8
| * performance/read-ahead: prevent stale data being returned to application.Raghavendra G2017-05-111-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Assume that fd is shared by two application threads/processes. T0 read is triggered from app-thread t1 and read call passes through write-behind. T1 app-thread t2 issues a write. The page on which read from t1 is waiting is marked stale T2 write-behind caches write and indicates to application as write complete. T3 app-thread t2 issues read to same region. Since, there is already a page for that region (created as part of read at T0), this read request waits on that page to be filled (though it is stale, which is a bug). T4 read (triggered at T0) completes from brick (with write still pending). Now both read requests from t1 and t2 are served this data (though data is stale from app-thread t2's perspective - which is a bug) T5 write is flushed to brick by write-behind. Fix is to not to serve data from a stale page, but instead initiate a fresh read to back-end. >Change-Id: Id6af733464fa41bb4e81fd29c7451c73d06453fb >BUG: 1414242 >Signed-off-by: Raghavendra G <rgowdapp@redhat.com> >Reviewed-on: https://review.gluster.org/7447 >Smoke: Gluster Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Csaba Henk <csaba@redhat.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Zhou Zhengping <johnzzpcrystal@gmail.com> >Reviewed-by: Amar Tumballi <amarts@redhat.com> (cherry picked from commit 2ff39c5cbea6fbda0d7a442f55e6dc2a72efb171) Change-Id: Id6af733464fa41bb4e81fd29c7451c73d06453fb BUG: 1449314 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: https://review.gluster.org/17223 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
* | io-stats: Expose io-thread queue depthsShreyas Siravara2017-08-302-8/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - This diff exposes the io-thread queue depths by sending a specialized getxattr() call down to the io-threads translator. - Port of D3086477, D3094145, D3095505 to 3.8 Test Plan: Tested on devserver, will run prove tests. Valgrind + ASAN pass as well. Reviewers: rwareing, kvigor Subscribers: dld, moox, dph Differential Revision: https://phabricator.fb.com/D3086477 Change-Id: Ia452a4fcdb9173a751c4cb48d739b25c235f6855 Reviewed-on: https://review.gluster.org/18143 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
* | [io-cache] cache statfsDavid Wolinsky2017-07-142-16/+127
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: cache calls to statfs - io-cache must be enabled - then enable statfs caching - also can configure an independent cache time Test Plan: unit test basic/cache.t Reviewers: rwareing, sshreyas Subscribers: rappleye Differential Revision: https://phabricator.fb.com/D2524471 Change-Id: I55e0a773f9e24c2358d6fbbabbaf58bd5bd89ffc Tasks: 8618383 Reviewed-on: https://review.gluster.org/17771 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* | Fix merge error which broke all the thingsKevin Vigor2017-01-251-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Fix merge error which broke all the things. Test Plan: prove tests/basic/accept_v4v6.t Reviewers: Subscribers: Tasks: Blame Revision: Change-Id: I65a3e1c60d7e63093c29e7357cc92cda385bce46 Signed-off-by: Kevin Vigor <kvigor@fb.com> Reviewed-on: https://review.gluster.org/16466 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
* | Merge remote-tracking branch 'origin/release-3.8' into merge-3.8Kevin Vigor2017-01-232-22/+70
|\| | | | | | | Change-Id: Ie6c73dee0b6798af4a69c43c0b03c3d02ff36aa2
| * performance/io-threads: Exit threads in fini() as wellPranith Kumar K2017-01-172-11/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: io-threads starts the thread in 'init()' but doesn't clean them up on 'fini()'. It relies on PARENT_DOWN to exit threads but there can be cases where event before PARENT_UP the graph init code can think of issuing fini(). This code path is hit when glfs_init() is called on a volume that is in 'stopped' state. It leads to a crash in ganesha process, because the io-thread tries to access freed memory. Fix: Ideal fix would be to wait for all fops in io-thread list to be completed on PARENT_DOWN, and have fini() do cleanup of threads. Because there is no proper documentation about how PARENT_DOWN/fini are supposed to be used, we are getting different kinds of sequences in different higher level protocols. So for now cleaning up in both PARENT_DOWN and fini(). Fuse doesn't call fini() gfapi is not calling PARENT_DOWN in some cases, so for now I don't see another way out. >BUG: 1396793 >Change-Id: I9c9154e7d57198dbaff0f30d3ffc25f6d8088aec >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15888 >Smoke: Gluster Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> >(cherry picked from commit 25817a8c868b6c1b8149117f13e4216a99e453aa) BUG: 1412941 Change-Id: I5e36a7d253f2ef8abce507eced1eb7073cff930c Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/16397 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
| * performance/io-threads: Exit all threads on PARENT_DOWNPranith Kumar K2017-01-172-23/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: When glfs_fini() is called on a volume where client.io-threads is enabled, fini() will free up iothread xl's private structure but there would be some threads that are sleeping which would wakeup after the timedwait completes leading to accessing already free'd memory. Fix: As part of parent-down, exit all sleeping threads. Please note that the upstream patch differs from this a little bit, because least-prio-throttling feature is removed from master, 3.9 >BUG: 1381830 >Change-Id: I0bb8d90241112c355fb22ee3fbfd7307f475b339 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15620 >Smoke: Gluster Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> >(cherry picked from commit d7a5ca16911caca03cec1112d4be56a9cda2ee30) BUG: 1412941 Change-Id: I6341156251279b24ab2323cedf1b9722e42da671 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/16396 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* | write-behind: Allow trickling-writes to be configurable, fix usage of ↵Shreyas Siravara2016-12-141-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | page_size and window_size Summary: - It adds a configurable option for "trickling-writes". - Makes `__wb_preprocess_winds()` use `wb_inode->window_conf` rather than `page_size`, so that the window-size option is actually respected. - This is a port of D3576122 & D3738605 to 3.8. Test Plan: - Prove test which looks @ brick-level FOPs and ensures that they fall in the right write-size bucket. Reviewed By: rwareing Signature: t1:3576122:1468892648:6923a6a19b18888577ce5173b5c9cb9531f941e7 Change-Id: I379a9f2f0c4768c9052b7e9dd71c5f0469cb2d68 Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Reviewed-on: http://review.gluster.org/16079 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* | performance/md-cache: Add an option to cache all xattrs for an inodeShreyas Siravara2016-12-141-11/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - We add an option to cache all extended attributes for an inode - This is an option to bypass the whitelisted xattrs to cache Test Plan: - Prove tests Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: Ia52bed22aa8d84f953fe1d022df929674d716e9e Reviewed-by: Kevin Vigor <kvigor@fb.com> Tested-by: Shreyas Siravara <sshreyas@fb.com> Reviewed-on: http://review.gluster.org/16126 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* | performance/io-threads: Reduce the number of timing calls in iot_workerMax Rijevski2016-12-091-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - Reduce the amount of unnecessary timing calls in iot_worker servicing. - The current logic is unnecessarily accurate and hurts performance for many small FOPS. - This is a cherry-pick of D3156588 for 3.8 Test Plan: - Prove tests Change-Id: I6db4f1ad9a48d9d474bb251a2204969061021954 Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Reviewed-on: http://review.gluster.org/16081 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com>
* | performance/io-threads: Eliminate spinlock contention via fops-per-thread-ratioRichard Wareing2016-12-092-5/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - Background: Frequently spinlock is observed on busy GFS clusters, which wastes CPU and destroys the performance of the cluster. Current solutions to this problem involve under-provisioning the thread pool, but this is problematic as during busy periods there may not be enough threads to service the queue. - This patch introduces a technique to avoid the stampeding herd problem with the io-threads workers. This is done by dynamically tuning the threads by a ratio of threads to queue depth, there-by keeping already running threads sufficiently busy by a tunable FOP to thread ratio. Ratio is controllable by the performanace.io-threads-fops-per-threads-ratio option. - More detailed reading on this approach can be found here: https://h21007.www2.hp.com/portal/download/files/unprot/hpux/MakingConditionVariablesPerform.pdf - Cherry-pick of D2530504 for 3.8 Test Plan: - Stress teston my dev server - shadow testing Reviewed By: moox, sshreyas Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: I771ae783aa4ca5a6fd0449db64e07d1f4bff0d04 Reviewed-on: http://review.gluster.org/16080 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Tested-by: Shreyas Siravara <sshreyas@fb.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com>
* | performance/md-cache: Fix caching for root inodeShreyas Siravara2016-12-081-4/+32
|/ | | | | | | | | | | | | | | | | | | | | | | Summary: - `is_mdc_key_satisfied()` is returning 0 when it has not checked any of the keys - This causes the cache'd value for the root inode to always be invalid (`mdc_xattr_satisfied()` returns 0, which causes us to jump to `uncached'). - In this diff we add a new option called "strict-xattrs", when enabled winds getxattr calls for those keys not present in our cache. - This allows "special" getxattr commands (quota cli commands for example) to work when md-cache is enabled. - This is a port of D4135452 Test Plan: - Test on devserver and see latency improvements for root inode. - Prove tests Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: I8ff75595e821d7a714224b3b3dded23f0a93560a Reviewed-on: http://review.gluster.org/16060 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com> Tested-by: Shreyas Siravara <sshreyas@fb.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* performance/open-behind: Avoid deadlock in statedumpPranith Kumar K2016-11-101-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: open-behind is taking fd->lock then inode->lock where as statedump is taking inode->lock then fd->lock, so it is leading to deadlock In open-behind, following code exists: void ob_fd_free (ob_fd_t *ob_fd) { loc_wipe (&ob_fd->loc); <<--- this takes (inode->lock) ....... } int ob_wake_cbk (call_frame_t *frame, void *cookie, xlator_t *this, int op_ret, int op_errno, fd_t *fd_ret, dict_t *xdata) { ....... LOCK (&fd->lock); <<---- fd->lock { ....... __fd_ctx_del (fd, this, NULL); ob_fd_free (ob_fd); <<<--------------- } UNLOCK (&fd->lock); ....... } ================================================================= In statedump this code exists: inode_dump (inode_t *inode, char *prefix) { ....... ret = TRY_LOCK(&inode->lock); <<---- inode->lock ....... fd_ctx_dump (fd, prefix); <<<----- ....... } fd_ctx_dump (fd_t *fd, char *prefix) { ....... LOCK (&fd->lock); <<<------------------ this takes fd-lock { ....... } Fix: Make sure open-behind doesn't call ob_fd_free() inside fd->lock >BUG: 1393259 >Change-Id: I4abdcfc5216270fa1e2b43f7b73445f49e6d6e6e >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15808 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Poornima G <pgurusid@redhat.com> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1393682 Change-Id: I45a0fbed683ef6acb7900df87534927f332fdaaa Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15818 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* md-cache: Invalidate cache entry for open() with O_TRUNCSoumya Koduri2016-11-031-0/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a file is opened with O_TRUNC flag set, its size gets set to '0'. This case needs to be handled in md-cache to avoid sending incorrect cached stat. This is backport of below mainline patch - http://review.gluster.org/#/c/15618/ > Change-Id: I95d1f8a6634734898883ede010c3e7b0b7eb97d9 > BUG: 1382266 > Signed-off-by: Soumya Koduri <skoduri@redhat.com> > Reviewed-on: http://review.gluster.org/15618 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> > Tested-by: jiffin tony Thottan <jthottan@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> > (cherry picked from commit 6ca5d6382f03685b31b12accb095093cf1486603) Change-Id: I92349f5b48aef07f3790db7aae25bfa2ddb5947e BUG: 1391450 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/15771 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* performance/write-behind: fix flush stuck by former failed writesRyan Ding2016-11-031-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the issue is happened in this case: assume a file is opened with fd1 and fd2. 1. some WRITE opto fd1 got error, they were add back to 'todo' queue because of those error. 2. fd2 closed, a FLUSH op is send to write-behind. 3. FLUSH can not be unwind because it's not a legal waiter for those failed write(as func __wb_request_waiting_on() say). and those failed WRITE also can not be ended if fd1 is not closed. fd2 stuck in close syscall. to resolve this issue, we can change the way we determine 2 requests is 'conflict': flush/fsync is not conflict with those write that is not belonged to them. so __wb_pick_winds() can wind the FLUSH op. below is some information when the stuck issue happen: glusterdump logs: [xlator.performance.write-behind.wb_inode] path=/ltp-F9eG0ZSOME/rw-buffered-16436 inode=0x7fdbe8039b9c window_conf=1048576 window_current=249856 transit-size=0 dontsync=0 [.WRITE] request-ptr=0x7fdbe8020200 refcount=1 wound=no generation-number=4 req->op_ret=-1 req->op_errno=116 sync-attempts=3 sync-in-progress=no size=131072 offset=1220608 lied=-1 append=0 fulfilled=0 go=0 [.WRITE] request-ptr=0x7fdbe8068c30 refcount=1 wound=no generation-number=5 req->op_ret=-1 req->op_errno=116 sync-attempts=2 sync-in-progress=no size=118784 offset=1351680 lied=-1 append=0 fulfilled=0 go=0 [.FLUSH] request-ptr=0x7fdbe8021cd0 refcount=1 wound=no generation-number=6 req->op_ret=0 req->op_errno=0 sync-attempts=0 gdb detail about above 3 requests: (gdb) print *((wb_request_t *)0x7fdbe8021cd0) $2 = {all = {next = 0x7fdbe803a608, prev = 0x7fdbe8068c30}, todo = {next = 0x7fdbe803a618, prev = 0x7fdbe8068c40}, lie = {next = 0x7fdbe8021cf0, prev = 0x7fdbe8021cf0}, winds = {next = 0x7fdbe8021d00, prev = 0x7fdbe8021d00}, unwinds = {next = 0x7fdbe8021d10, prev = 0x7fdbe8021d10}, wip = { next = 0x7fdbe8021d20, prev = 0x7fdbe8021d20}, stub = 0x7fdbe80224dc, write_size = 0, orig_size = 0, total_size = 0, op_ret = 0, op_errno = 0, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_FLUSH, lk_owner = {len = 8, data = "W\322T\f\271\367y$", '\000' <repeats 1015 times>}, iobref = 0x0, gen = 6, fd = 0x7fdbe800f0dc, wind_count = 0, ordering = {size = 0, off = 0, append = 0, tempted = 0, lied = 0, fulfilled = 0, go = 0}} (gdb) print *((wb_request_t *)0x7fdbe8020200) $3 = {all = {next = 0x7fdbe8068c30, prev = 0x7fdbe803a608}, todo = {next = 0x7fdbe8068c40, prev = 0x7fdbe803a618}, lie = {next = 0x7fdbe8068c50, prev = 0x7fdbe803a628}, winds = {next = 0x7fdbe8020230, prev = 0x7fdbe8020230}, unwinds = {next = 0x7fdbe8020240, prev = 0x7fdbe8020240}, wip = { next = 0x7fdbe8020250, prev = 0x7fdbe8020250}, stub = 0x7fdbe8062c3c, write_size = 131072, orig_size = 4096, total_size = 0, op_ret = -1, op_errno = 116, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_WRITE, lk_owner = {len = 8, data = '\000' <repeats 1023 times>}, iobref = 0x7fdbe80311a0, gen = 4, fd = 0x7fdbe805c89c, wind_count = 3, ordering = {size = 131072, off = 1220608, append = 0, tempted = -1, lied = -1, fulfilled = 0, go = 0}} (gdb) print *((wb_request_t *)0x7fdbe8068c30) $4 = {all = {next = 0x7fdbe8021cd0, prev = 0x7fdbe8020200}, todo = {next = 0x7fdbe8021ce0, prev = 0x7fdbe8020210}, lie = {next = 0x7fdbe803a628, prev = 0x7fdbe8020220}, winds = {next = 0x7fdbe8068c60, prev = 0x7fdbe8068c60}, unwinds = {next = 0x7fdbe8068c70, prev = 0x7fdbe8068c70}, wip = { next = 0x7fdbe8068c80, prev = 0x7fdbe8068c80}, stub = 0x7fdbe806746c, write_size = 118784, orig_size = 4096, total_size = 0, op_ret = -1, op_errno = 116, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_WRITE, lk_owner = {len = 8, data = '\000' <repeats 1023 times>}, iobref = 0x7fdbe8052b10, gen = 5, fd = 0x7fdbe805c89c, wind_count = 2, ordering = {size = 118784, off = 1351680, append = 0, tempted = -1, lied = -1, fulfilled = 0, go = 0}} you can see they are all on 'todo' queue, and FLUSH op fd is not the same WRITE op fd. > Change-Id: Id687f9cd3b9f281e1a97c83f1ce981ede272b8ab > BUG: 1372211 > Signed-off-by: Ryan Ding <ryan.ding@open-fs.com> Change-Id: Id687f9cd3b9f281e1a97c83f1ce981ede272b8ab BUG: 1390838 Signed-off-by: Ryan Ding <ryan.ding@open-fs.com> Reviewed-on: http://review.gluster.org/15762 Tested-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* performance/write-behind: remove the request from liability queue inRaghavendra G2016-10-171-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | wb_fulfill_request Before this patch, a request is removed from liability queue only when ref count of request hits 0. Though, wb_fulfill_request does an unref, it need not be the last unref and hence the request may survive in liability queue till the last unref. Let, T1: the time at which wb_fulfill_request is invoked T2: the time at which last unref is done on request Let's consider a case of T2 > T1. In the time window between T1 and T2, any other request (waiter) conflicting with request in liability queue (blocker - basically a write which has been lied) is blocked from winding. If T2 happens to be when wb_do_unwinds is invoked, no further processing of request list happens and "waiter" would get blocked forever. An example imaginary sequence of events is given below: 1. A write request w1 is picked up for unwinding in __wb_pick_unwinds (but unwind is not done _yet_ and hence reference remains). However, w1 is moved to liability queue. Let's call this invocation of wb_process_queue by wb_writev as PQ1. 2. A flush (f1) request hits write behind. Since the liability queue of inode is not empty, f1 is not picked for unwinding. Let's call the invocation of wb_process_queue by wb_flush as PQ2. 3. PQ2 continues and picks w1 for fulfilling and invokes wb_fulfill. As part of successful wb_fulfill_cbk, wb_fulfill_request (w1) is invoked. But, w1 is not freed (and hence not removed from liability queue) as w1 is not unwound _yet_ and a ref remains (PQ1 has not invoked wb_do_unwinds _yet_). 4. wb_fulfill_cbk (triggered by PQ2) invokes a wb_process_queue (let's say PQ3). f1 is not resumed in PQ3 as w1 is still in liability queue. At this time, PQ2 and PQ3 are complete. 5. PQ1 continues, unwinds w1 and does last unref on w1 and w1 is freed (and removed from liability queue). Since PQ1 didn't invoke wb_fulfill on any other write requests, there won't be any future codepaths that would invoke wb_process_queue and f1 is stuck forever. With this fix, w1 is removed from liability queue in step 3 above and PQ3 resumes f1 in step 4 (as there are no requests conflicting with f1 in liability queue during execution of PQ3). > Signed-off-by: Raghavendra G <rgowdapp@redhat.com> > BUG: 1379655 > Change-Id: Idacda1fcd520ac27f30224f8dfe8360dba6ac6cb > Reviewed-on: http://review.gluster.org/15579 > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit a8b2a981881221925bb5edfe7bb65b25ad855c04) Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1385620 Change-Id: Idacda1fcd520ac27f30224f8dfe8360dba6ac6cb Reviewed-on: http://review.gluster.org/15658 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* performance/open-behind: Pass O_DIRECT flags for anon fd reads when requiredKrutika Dhananjay2016-09-252-39/+28
| | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/15537 cherry-picked from a412a4f50d8ca2ae68dbfa93b80757889150ce99 Writes are already passing the correct flags at the time of open(). Also, make io-cache honor direct-io for anon-fds with O_DIRECT flag during reads. Change-Id: I9eb89c3bda34f9287861eb3b53c3d6a7b967c105 BUG: 1375959 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15552 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* performance/decompounder: Add graph for decompounder xlatorAshish Pandey2016-06-141-0/+6
| | | | | | | | | | | | | | | | | | | | | | | This xlator will fall below protocol/server. This is mandatory xlator without any options. Observed that the callback for decompounder translator was not added which was causing volume start to fail. Added cbks for decompounder. master- http://review.gluster.org/#/c/13968/ Change-Id: I3e16a566376338d9c6d36d6fbc7bf295fda9f3a6 BUG: 1346222 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/14729 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Anuradha Talur <atalur@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* readdir-ahead: Prefetch xattrs needed by md-cachePrashanth Pai2016-05-103-7/+204
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Negative cache feature implementation in md-cache requires xattrs returned by posix to be intercepted for every call that can possibly return xattrs. This includes readdirp(). This is crucial to treat missing keys in cache as a case of negative entry (returns ENODATA) md-cache puts names of xattrs that it wants to cache in xdata and passes it down to posix which returns the specified xattrs in the callback. This is done in lookup() and readdirp(). Hence, a xattr that is cached can be invalidated during readdirp_cbk too. This is based on the assumption that readdirp() will always return all xattrs that md-cache is interested in. However, this is not the case when readdirp() call is served from readdir-ahead's cache. readdir-ahead xlator will pre-fetch dentries during opendir_cbk and readdirp. These internal readdirp() calls made by readdir-ahead xlator does not set xdata in it's requests. Hence, no xattrs are fetched and stored in it's internal cache. This causes metadata loss in gluster-swift. md-cache returns ENODATA during getxattr() call even though the xattr for that object exists on the brick. On receiving ENODATA, gluster-swift will create new metadata and do setxattr(). This results in loss of information stored in existing xattr. Fix: During opendir, md-cache will communicate to readdir-ahead asking it to store the names of xattrs it's interested in so that readdir-ahead can fetch those in all subsequent internal readdirp() calls issued by it. This stored names of xattrs is invalidated/updated on the next real readdirp() call issued by application. This readdirp() call will have xdata set correctly by md-cache xlator. Cherry picked from commit 0c73e7050c4d30ace0c39cc9b9634e9c1b448cfb: > BUG: 1333023 > Change-Id: I32d46f93a99d4ec34c741f3c52b0646d141614f9 > Reviewed-on: http://review.gluster.org/14214 > Tested-by: Prashanth Pai <ppai@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Smoke: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Tested-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1334699 Change-Id: I32d46f93a99d4ec34c741f3c52b0646d141614f9 Signed-off-by: Prashanth Pai <ppai@redhat.com> Reviewed-on: http://review.gluster.org/14281 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com>
* performance/write-behind: guaranteed retry after a short writeRaghavendra G2016-05-041-17/+70
| | | | | | | | | | | | | | | | | | * Don't mark the request with a fake EIO after a short write. * retry the remaining buffer at least once before unwinding reply to application. This way we capture correct error from backend (ENOSPC, EDQUOT etc). Thanks to "Vijaikumar Mallikarjuna"<vmallika@redhat.com> for the test script. Change-Id: I73a18b39b661a7424db1a7855a980469a51da8f9 BUG: 1332789 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/14197 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* io-threads: add setactivelk () fopSusant Palai2016-05-011-0/+1
| | | | | | | | | | | Change-Id: I74225a39348f6bb2fbdd1513676a70019227e45f BUG: 1326085 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/14012 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* io-threads: add getactivelk () fopSusant Palai2016-05-011-0/+1
| | | | | | | | | | | Change-Id: Iaef97ea02dfc54e1a0f23ab972f58b0687b4709c BUG: 1326085 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/13993 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* performance/decompounder: Introducing decompounder xlatorAnuradha Talur2016-04-257-1/+1085
| | | | | | | | | | | | | | This xlator decompounds the compound fops received, and executes them serially. Change-Id: Ieddcec3c2983dd9ca7919ba9d7ecaa5192a5f489 BUG: 1303829 Signed-off-by: Anuradha Talur <atalur@redhat.com> Reviewed-on: http://review.gluster.org/13577 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* io-threads: Add lease() fopPoornima G2016-04-211-0/+1
| | | | | | | | | | | Change-Id: Ie4921867948d23b8b6c570196e88680cdb5ebfbc BUG: 1319992 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/11599 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* md-cache: Cache gluster-swift metadataPrashanth Pai2016-03-161-0/+22
| | | | | | | | | | | | | BUG: 1317785 Change-Id: Ie02b8fc294802f8fdf49dee8bf97f1e6177d92bd Signed-off-by: Prashanth Pai <ppai@redhat.com> Reviewed-on: http://review.gluster.org/13735 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Poornima G <pgurusid@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
* performance/open-behind: Fix fdctx dump NULL dereferencePranith Kumar K2016-02-081-3/+4
| | | | | | | | | | | | | Also printing flags correctly in statedump now Change-Id: Ibfdd74aab5643ecc47d0a88f109d5d1050685f5a BUG: 1294051 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/13076 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* performance/write-behind: fix memory corruptionRaghavendra G2016-01-131-9/+45
| | | | | | | | | | | | | | | | | | | | | 1. while handling short writes, __wb_fulfill_short_write would've freed current request before moving on to next in the list if request is not big enough to accomodate completed number of bytes. So, make sure to save next member before invoking __wb_fulfill_short_write on current request. Also handle the case where request is exactly the size of remaining completed number of bytes. 2. When write request is unwound because there is a conflicting failed liability, make sure its deleted from tempted list. Otherwise there will be two unwinds (one as part handling a failed request in wb_do_winds and another in wb_do_unwinds). Change-Id: Id1b54430796b19b956d24b8d99ab0cdd5140e4f6 BUG: 1297373 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/13213 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* performance/write-behind: maintain correct transit size in case ofRaghavendra G2016-01-061-15/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | short writes. 1. Imagine a write when cache is filled with failed syncs. 2. This write won't be unwound since cache size has exceeded configured limit. 3. With trickling-writes on by default, the last write request wont be considered for winding when there is non zero in-transit size. 4. There was a bug in accounting of in-transit size when winds resulted in short writes. Due to this bug, in-transit size used to be non-zero even when there are no syncs in progress. 5. Due to 3 and 4, current write request won't be wound till there is another write or fsync or flush from application. But application can't do any other fop till current write request is unwound. This resulted in deadlock and hence application would be hung in 'D' state. This patch fixes bug in accounting of in-transit size during short writes. Change-Id: I04ce8bb510efaaed7623cac38d69b32dbc3730ce Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1279730 Reviewed-on: http://review.gluster.org/13177 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* wb: remove inline keywordRaghavendra Talur2015-12-311-1/+1
| | | | | | | | | | | | | | | When compiled with -Werror flag gcc throws the following error: ‘iov_length’ is static but used in inline function ‘__wb_modify_write_request’ which is not static. Let gcc decide what functions to inline and remove the inline keyword. Change-Id: I6d832596eefcf08306634936e11d2c8d4b8f9ccd BUG: 1279730 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/13113
* build: export minimum symbols from xlators for correct resolutionKaleb S KEITHLEY2015-12-229-9/+9
| | | | | | | | | | | | | | | | | | | | | | Revisiting http://review.gluster.org/#/c/11814/, which unintentionally introduced warnings from libtool about the xlator .so names. According to [1], the -module option must appear in the Makefile.am file(s); if -module is defined in a macro, e.g. in configure(.ac), then libtool will not recognize that this is a module and will emit a warning. [1] http://www.gnu.org/software/automake/manual/automake.html#Libtool-Modules Change-Id: Ifa5f9327d18d139597791c305aa10cc4410fb078 BUG: 1248669 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/13003 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* performance/write-behind: retry "failed syncs to backend"Raghavendra G2015-12-221-104/+376
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. When sync fails, the cached-write is still preserved unless there is a flush/fsync waiting on it. 2. When a sync fails and there is a flush/fsync waiting on the cached-write, the cache is thrown away and no further retries will be made. In other words flush/fsync act as barriers for all the previous writes. The behaviour of fsync acting as a barrier is controlled by an option (see below for details). All previous writes are either successfully synced to backend or forgotten in case of an error. Without such barrier fop (especially flush which is issued prior to a close), we end up retrying for ever even after fd is closed. 3. If a fop is waiting on cached-write and syncing to backend fails, the waiting fop is failed. 4. sync failures when no fop is waiting are ignored and are not propagated to application. For eg., a. first attempt of sync of a cached-write w1 fails b. second attempt of sync of w1 succeeds If there are no fops dependent on w1 are issued b/w a and b, application won't know about failure encountered in a. 5. The effect of repeated sync failures is that, there will be no cache for future writes and they cannot be written behind. fsync as a barrier and resync of cached writes post fsync failure: ================================================================== Whether to keep retrying failed syncs post fsync is controlled by an option "resync-failed-syncs-after-fsync". By default, this option is set to "off". If sync of "cached-writes issued before fsync" (to backend) fails, this option configures whether to retry syncing them after fsync or forget them. If set to on, cached-writes are retried till a "flush" fop (or a successful sync) on sync failures. fsync itself is failed irrespective of the value of this option, when there is a sync failure of any cached-writes issued before fsync. Change-Id: I6097c9257bfb9ee5b15616fbe6a0576ae9af369a Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1279730 Reviewed-on: http://review.gluster.org/12594
* md-cache: Remove readdirp fop for md-cacheMohammed Rafi KC2015-11-091-1/+4
| | | | | | | | | | | | | | | | readdirp call will return inode for each entry and will share this nodeid with kernal, also md-cache will cache this gfid and base name. So when a lookup operation is perfromed on such an inode, md-cache will wind the call, that prevents populating inode ctx for other lower layer xlators. Change-Id: I43c768703a3cc66d05b1c32909d1a2781001cb49 BUG: 1236032 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/11894 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* core: fix Ubuntu code audit (cppcheck) resultsKaleb S. KEITHLEY2015-11-011-1/+1
| | | | | | | | | | | | | | | | | | | This change includes an additional fix (forward port) of a fix made on the release-3.x branches to address a comment made after the original change was merged on the master branch. * release-3.7 * Change-Id: Ie15c5919e5bf9b0a1c66e20dc42d80fdfa8bd7f4 * BZ: 1227808 * http://review.gluster.org/11069 Change-Id: I4fc2672ab1a17998b2e40bc43eb6a3e15058a086 BUG: 1109180 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/11067 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* build: export minimum symbols from xlators for correct resolutionKaleb S. KEITHLEY2015-09-249-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | We've been lucky that we haven't had any symbol collisions until now. Now we have a collision between the snapview-client's svc_lookup() and libntirpc's svc_lookup() with nfs-ganesha's FSAL_GLUSTER and libgfapi. As a short term solution all the snapview-client's FOP methods were changed to static scope. See http://review.gluster.org/11805. This works in snapview-client because all the FOP methods are defined in a single source file. This solution doesn't work for other xlators with FOP methods defined in multiple source files. To address this we link with libtool's '-export-symbols $symbol-file' (a wrapper around `ld --version-script ...` --- on linux anyway) and only export the minimum required symbols from the xlator sharedlib. N.B. the libtool man page says that the symbol file should be named foo.sym, thus the rename of *.exports to *.sym. While foo.exports worked, we will follow the documentation. Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> BUG: 1248669 Change-Id: I1de68b3e3be58ae690d8bfb2168bfc019983627c Reviewed-on: http://review.gluster.org/11814 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* all: reduce "inline" usageJeff Darcy2015-09-011-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | There are three kinds of inline functions: plain inline, extern inline, and static inline. All three have been removed from .c files, except those in "contrib" which aren't our problem. Inlines in .h files, which are overwhelmingly "static inline" already, have generally been left alone. Over time we should be able to "lower" these into .c files, but that has to be done in a case-by-case fashion requiring more manual effort. This part was easy to do automatically without (as far as I can tell) any ill effect. In the process, several pieces of dead code were flagged by the compiler, and were removed. Change-Id: I56a5e614735c9e0a6ee420dab949eac22e25c155 BUG: 1245331 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/11769 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
* performance translators : porting the missing gf_log to gf_msgHari Gowtham2015-08-318-11/+71
| | | | | | | | | Change-Id: I5cc2b4669b164fe09637c86da05d2d94589dd7e4 BUG: 1253149 Signed-off-by: Hari Gowtham <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/11906 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* feature/performace: Fix broken buildKotresh HR2015-06-289-8/+10
| | | | | | | | | | | | Fix the build broken because of patch http://review.gluster.org/#/c/9822/ Change-Id: I0ee502c0fad5be87186c80ab4729036f52f85fa3 BUG: 1194640 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/11451 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Niels de Vos <ndevos@redhat.com>
* features/bit-rot-stub: deny access to bad objectsRaghavendra Bhat2015-06-271-0/+5
| | | | | | | | | | | | | | | | | * Access to bad objects (especially operations such as open, readv, writev) should be denied to prevent applications from getting wrong data. * Do not allow anyone apart from scrubber to set bad object xattr. * Do not allow bad object xattr to be removed. Change-Id: Ia9185a067233a9f26e3d41d41d11d9a4eb0da827 BUG: 1210689 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/11126 Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* Logging: Porting the performance translatorarao2015-06-2722-269/+1267
| | | | | | | | | | | | | logs to new logging framework. Change-Id: Ie6aaf8d30bd4457bb73c48e23e6b1dea27598644 BUG: 1194640 Signed-off-by: arao <arao@redhat.com> Reviewed-on: http://review.gluster.org/9822 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* build: do not #include "config.h" in each fileNiels de Vos2015-05-2914-70/+0
| | | | | | | | | | | | | | | | | | Instead of including config.h in each file, and have the additional config.h included from the compiler commandline (-include option). When a .c file tests for a certain #define, and config.h was not included, incorrect assumtions were made. With this change, it can not happen again. BUG: 1222319 Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/10808 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* quick-read: Do a null check before unrefRavishankar N2015-04-171-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 4ea5b8d2046b9e0bc7f24cdf1b2e72ab8b462c9e seems to have removed the check as a part of static analyis fixes but I'm seeing errors in the client log. -------------------- touch /mnt/fuse_mnt/zero-byte-file echo 3 > /proc/sys/vm/drop_caches cat /mnt/fuse_mnt/zero-byte-file mount log: [2015-04-13 05:52:21.683256] E [iobuf.c:790:iobuf_unref] (--> /usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x232)[0x7feda12c0e24] (--> /usr/local/lib/libglusterfs.so.0(iobuf_unref+0x56)[0x7feda1304c8e] (--> /usr/local/lib/glusterfs/3.7dev/xlator/performance/quick-read.so(qr_readv_cached+0x466)[0x7fed95b7e2fc] (--> /usr/local/lib/glusterfs/3.7dev/xlator/performance/quick-read.so(qr_readv+0x70)[0x7fed95b7e385] (--> /usr/local/lib/libglusterfs.so.0(default_readv_resume+0x270)[0x7feda12d4401] ))))) 0-iobuf: invalid argument: iobuf -------------------- Hence re-adding the checks. Note: I'm using the same BZ Id used for the original commit though it is in MODIFIED state just for correlation. Change-Id: I79749814a9d4082933e3b306ce449492ee5b43a5 BUG: 1109180 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/10206 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* performance/md-cache: set right error check in {f}getxattr_cbk()Vijay Bellur2015-04-061-2/+2
| | | | | | | | | | | | | | | Currently mdc_{f}getxattr_cbk() check(s) for a non-zero value to determine if any cache update has to be performed. Right from posix xlator, op_ret has a positive value upon success and -1 upon failure. This patch sets right the check in getxattr callbacks so that xattr cache update happens for successful calls. Change-Id: Ifa5ec38bdf7e3dc095de9a56d91559b13cd9e8b6 BUG: 1208784 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/10127 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* Avoid conflict between contrib/uuid and system uuidEmmanuel Dreyfus2015-04-042-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | glusterfs relies on Linux uuid implementation, which API is incompatible with most other systems's uuid. As a result, libglusterfs has to embed contrib/uuid, which is the Linux implementation, on non Linux systems. This implementation is incompatible with systtem's built in, but the symbols have the same names. Usually this is not a problem because when we link with -lglusterfs, libc's symbols are trumped. However there is a problem when a program not linked with -lglusterfs will dlopen() glusterfs component. In such a case, libc's uuid implementation is already loaded in the calling program, and it will be used instead of libglusterfs's implementation, causing crashes. A possible workaround is to use pre-load libglusterfs in the calling program (using LD_PRELOAD on NetBSD for instance), but such a mechanism is not portable, nor is it flexible. A much better approach is to rename libglusterfs's uuid_* functions to gf_uuid_* to avoid any possible conflict. This is what this change attempts. BUG: 1206587 Change-Id: I9ccd3e13afed1c7fc18508e92c7beb0f5d49f31a Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/10017 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>