summaryrefslogtreecommitdiffstats
path: root/xlators/performance/write-behind/src/write-behind.c
Commit message (Collapse)AuthorAgeFilesLines
* build: MacOSX Porting fixesHarshavardhana2014-04-241-2/+2
| | | | | | | | | | | | | | | | | | | | | git@forge.gluster.org:~schafdog/glusterfs-core/osx-glusterfs Working functionality on MacOSX - GlusterD (management daemon) - GlusterCLI (management cli) - GlusterFS FUSE (using OSXFUSE) - GlusterNFS (without NLM - issues with rpc.statd) Change-Id: I20193d3f8904388e47344e523b3787dbeab044ac BUG: 1089172 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Signed-off-by: Dennis Schafroth <dennis@schafroth.com> Tested-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Dennis Schafroth <dennis@schafroth.com> Reviewed-on: http://review.gluster.org/7503 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* write-behind: track filesize when doing extending writesNiels de Vos2014-02-271-4/+161
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A program that calls mmap() on a newly created sparse file, may receive a SIGBUS signal. If SIGBUS is not handled, a segmentation fault will occur and the program will exit. A bug in the write-behind translator can cause the creation of a sparse file created with open(), seek(), write() to be cached. The last write() may not be sent to the server, until write-behind deems this necessary. * open(.., O_TRUNC, ...)/creat() the file, it is 0 bytes big * seek() into the file, use offset 31 * write() 1 byte to the file * the range from byte 0-30 are unwritten so called 'sparse' The following illustration tries to capture this: Legend: [ = start of file _ = unallocated/unwritten bytes # = allocated bytes in the file ] = end of file [_______________#] | | '- byte 0 '- byte 31 Without this change, reading from byte 0-30 will return an error, and reading the same area through an mmap()'d pointer will trigger a SIGBUS. Reading from this range did not trigger the outstanding write() to be flushed. The brick that receives the read() (translated over the network from mmap()) does not know that the file has been extended, and returns -EINVAL. This error gets transported back from the brick to the glusterfs-fuse client, and translated by the Linux kernel/VFS into SIGBUS triggered by mmap(). In order to solve this, a new attribute to the wb_inode structure is introduced; the current size of the file. All FOPs that can modify the size, are expected to update wb_inode->size. This makes it possible for extending writes with an offset bigger than EOF to mark the unwritten area as modified/pending. Change-Id: If5ba6646732e6be26568541ea9b12852a5d0b988 BUG: 1058663 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/6835 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Fix for 'use after free' errors reported by coverity.Poornima2014-02-051-2/+2
| | | | | | | | | | Change-Id: I941fc89b2d696c7f227330321ed4bba3ed1deac4 BUG: 789278 Signed-off-by: Poornima <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/6868 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* write-behind: handle iobref_merge() error gracefullyAnand Avati2013-11-261-2/+3
| | | | | | | | | | | .. by UNWINDing ENOMEM error, rather than crashing. Change-Id: Ica2d6399eaf7e381e7ebc41155620559c139c4d3 BUG: 1034398 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6349 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@gmail.com>
* performance/write-behind: invoke request queue processing ifRaghavendra G2013-08-141-19/+30
| | | | | | | | | | | | | | | | | | | | | | we find fd marked bad while trying to fulfill lies. * flush was queued behind some unfulfilled write. * A previously wound write returned an error and hence fd was marked bad with corresponding error. * wb_fulfill_head (invocation probably rooted in wb_flush), before winding checks for failures of previous writes and since there was a failure, calls wb_head_done without even winding one request in head. * wb_head_done unrefs all the requests in list "head". * since flush was last operation on fd (and most likely last operation on inode itself), no one invokes wb_process_queue and flush is stuck in request queue for eternity. Change-Id: I3b5b114a1c401d477dd7ff64fb6119b43fda2d18 BUG: 988642 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/5398 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* write-behind: preserve error returned as-isAmar Tumballi2013-07-241-5/+1
| | | | | | | | | Change-Id: Ib766403774c1323e0bbddafedeaa47e7fa3a59fa Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 987415 Reviewed-on: http://review.gluster.org/5296 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* performance/write-behind: Enable write-behind when strict_O_DIRECT is not set.Vijay Bellur2013-05-281-2/+1
| | | | | | | | | | | | | When open() with O_DIRECT happens, write-behind was being disabled for the fd irrespective of strict_O_DIRECT option. This commit disables write-behind only when strict_O_DIRECT is enabled. Change-Id: Ieef180e52910c3bf64d46b26b0e5dc3b8542f6d2 BUG: 923556 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/4697 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* mount/fuse: enable fuse real async dio when availableBrian Foster2013-05-151-0/+5
| | | | | | | | | | | | | | | | | | | | | fuse has support for optimized async. direct I/O handling via the FUSE_ASYNC_DIO init flag. Enable FUSE_ASYNC_DIO when advertised by fuse. performance/write-behind: fix dio hang Also fix a hang observed during aio-stress testing due to conflicting request handling in write-behind. Overlapping requests are skipped in pick_winds and may never continue when the conflicting write in progress returns. Add a wb_process_queue() call after a non-wb request completes to keep the queue moving. BUG: 963258 Change-Id: Ifba6e8aba7a7790b288a32067706b75f263105d4 Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/5014 Reviewed-by: Anand Avati <avati@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* performance/write-behind: guarantee non-overlapping concurrent writesAnand Avati2013-02-201-1/+65
| | | | | | | | | | | | | | | | | Maintain a list of writes (either written behind or SYNC) which are currently "in progress" (i.e, STACK_WIND'ed towards server) and hold off any new STACK_WIND of write (either written behind or SYNC) which overlaps with any of the "in progress" writes. This is a guarantee which AFR's eager-lock depends upon (though not strictly a write-behind requirement) Change-Id: Icedd0b51b440366a906dc9223d62b7fd6ef2ca03 BUG: 857673 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4551 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <raghavendra@gluster.com>
* call-stub: internal refactorAnand Avati2013-02-191-40/+40
| | | | | | | | | | | | | | | | | - re-structure members of call_stub_t with new simpler layout - easier to inspect call_stub_t contents in gdb now - fix a bunch of double unrefs and double frees in cbk stub - change all STACK_UNWIND to STACK_UNWIND_STRICT and thereby fixed a lot of bad params - implement new API call_unwind_error() which can even be called on fop_XXX_stub(), and not necessarily fop_XXX_cbk_stub() Change-Id: Idf979f14d46256af0afb9658915cc79de157b2d7 BUG: 846240 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4520 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* performance/write-behind: mark fd bad if any written behind writes fail.Raghavendra G2013-02-171-57/+114
| | | | | | | | | Change-Id: I515fc26c61e1ea929a3049b3001c58a64f3e6c87 BUG: 765473 Signed-off-by: Raghavendra G <raghavendra@gluster.com> Reviewed-on: http://review.gluster.org/4515 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* performance/write-behind: do not try to take LOCK in forgetRaghavendra Bhat2013-02-031-7/+3
| | | | | | | | | | | LOCK attempt in wb_forget is unnecessary Change-Id: Ibdedc23d0c829c34aedd6fc5bc0e0a584b832514 BUG: 903566 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/4423 Reviewed-by: Anand Avati <avati@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd/cli: Updated the options descriptions for "volume set help"Avra Sengupta2013-01-211-0/+2
| | | | | | | | | Change-Id: I0db00b7334bb9707ab48bd661ac03a3ad818d6e4 BUG: 893458 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/4393 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* write-behind: fixes issues with iobuf length for large writesKaushal M2012-12-131-5/+15
| | | | | | | | | | | | | | | | Use of an unsigned type in some calculations of size would lead to segmentation faults, if several large adjacent writes came in concurrently. Also, improves buffer allocation code to take the size required into account. Credits for the patch go to Amar. Change-Id: I8a09c52d49909e4ee8e7d4dcfa02ec33ea36a551 BUG: 880948 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/4307 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* fix memory leaksRaghavendra Bhat2012-12-041-0/+2
| | | | | | | | | | | | | * write-behind: free the inode context in wb_forget * distribute: in readdirp callback put the allocated context to the inode * distribute: check if the layout is NULL before accessing it in layout_unref Change-Id: I7698f81b85b99d06bf6b01fc1a6e51e1593b5e27 BUG: 790709 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/4250 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* write-behind: use uint64_t for overlap comparisonAnand Avati2012-10-131-4/+4
| | | | | | | | | off_t is 'long int' (signed word) and therefore ULLONG_MAX - 1 Change-Id: I027de7a1b2ca24865d5d787f9986930e97911ca4 BUG: 857673 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4079
* performance/write-behind: use try lock while taking statedumpsRaghavendra Bhat2012-10-111-2/+9
| | | | | | | | | | Change-Id: I690e8bf650d6e6e50899c2e17a79f42789e701eb BUG: 843792 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/4036 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* write-behind: implement causal ordering and other cleanupAnand Avati2012-10-011-2418/+1068
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rules of causal ordering implemented: - If request A arrives after the acknowledgement (to the app, i.e, STACK_UNWIND) of another request B, then request B is said to have 'caused' request A. - (corollary) Two requests, which at any point of time, are unacknowledged simultaneously in the system can never 'cause' each other (wb_inode->gen is based on this) - If request A is caused by request B, AND request A's region has an overlap with request B's region, then then the fulfillment of request A is guaranteed to happen after the fulfillment of B. - FD of origin is not considered for the determination of causal ordering. - Append operation's region is considered the whole file. Other cleanup: - wb_file_t not required any more. - wb_local_t not required any more. - O_RDONLY fd's operations now go through the queue to make sure writes in the requested region get fulfilled before getting processed. - O_SYNC fd's operations now go through the queue to make sure previously acknowledged writes on the file (via other fds) are fulfilled before getting processed. - Option to not honor O_SYNC is now removed. - Option to ignore O_DIRECT is added (useful when running a VM and the drive appears with NCQ/TCQ or WCE=1 for the guest.) - Option to disable_first_nbytes is removed (as the cause of the bug which required this was diagnosed to be missing TCP_NODELAY.) - General cleanup and better conformance to coding style and convention. Change-Id: Ib44fb72da3727246b4a85174cb568c2f0231f6de BUG: 857673 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/3947 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* performance/write-behind: avoid deadlock while taking the statedump of fdsRaghavendra Bhat2012-08-191-6/+24
| | | | | | | | | | | * Provide a hook for forget Change-Id: Ide7ea6d4c6a7d0d93b81570cb544f2bbda526eeb BUG: 846916 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/3795 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* performance/write-behind: store the wb_inode in local before windingRaghavendra Bhat2012-08-111-1/+3
| | | | | | | | | | | | | | | * Store the write-behind's inode context in the local structure before winding the call so that in callback inode context is found. * Before returning EBADFD check if the inode context (wb_inode) is NULL, along with the inode type. Change-Id: If5a1c667efe6882a6efef1439cee3effc32ff9a7 BUG: 846536 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.com/3796 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* core: reduce the usage of global variablesAmar Tumballi2012-08-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | * move all the 'logging' related global variables into ctx * make gf_fop_list a 'const' global array, hence no init(), no edits. * make sure ctx is allocated without any dependancy on memory-accounting infrastructure, so it can be the first one to get allocated * globals_init() should happen with ctx as argument not yet fixed below in this patchset: * anything with 'THIS' related globals * anything related to compat_errno related globals as its one time init'd and not changed later on. * statedump related globals Change-Id: Iab8fc30d4bfdbded6741d66ff1ed670fdc7b7ad2 Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 764890 Reviewed-on: http://review.gluster.com/3767 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* performance/write-behind: maintain a per-inode request queue instead of ↵Raghavendra G2012-08-021-596/+768
| | | | | | | | | | | | | | | | maintaining per-fd path based operations like stat etc, whose results will be affected by writes have to be ordered with writes. With request queues maintained in inode this can be done naturally, than when they are maintained per open fd. Change-Id: Ibdde3b81366f642d07531632fc9062cb44fad2e7 BUG: 765443 Signed-off-by: Raghavendra G <raghavendra@gluster.com> Reviewed-on: http://review.gluster.com/712 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* performance/write-behind: detect short writes and pend an EIO errorBrian Foster2012-07-251-1/+15
| | | | | | | | | | | | | | | Write-behind returns write requests immediately and queues the request in memory for merging, etc. If a write is incomplete, pend an EIO error for the next fop. This ensures that write failures are not silent and potentially ignored. BUG: 809975 Change-Id: I0e0e6c8e710efab58ccfaf746501d00e459eb7ef Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.com/3712 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* performance/write-behind: preserve lk-owner while syncing writes.Raghavendra G2012-07-251-66/+143
| | | | | | | | | | | | | - This patch also makes syncing of non-overlapping but consecutive writes parallel. Till now only contiguous writes were synced parallely. Change-Id: Icf0d5ea373f30c79fcdc90ba44b7e7a1bc5f0111 BUG: 765141 Signed-off-by: Raghavendra G <raghavendra@gluster.com> Reviewed-on: http://review.gluster.com/269 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* remove useless if-before-free (and free-like) functionsJim Meyering2012-07-131-5/+2
| | | | | | | | | | | | See comments in http://bugzilla.redhat.com/839925 for the code to perform this change. Signed-off-by: Jim Meyering <meyering@redhat.com> BUG: 839925 Change-Id: I10e4ecff16c3749fe17c2831c516737e08a3205a Reviewed-on: http://review.gluster.com/3661 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* license: dual license under GPLV2 and LGPLV3+Kaleb KEITHLEY2012-05-101-14/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note that the license was not changed in any of the following: .../argp-standalone/... .../booster/... .../cli/... .../contrib/... .../extras/... .../glusterfsd/... .../glusterfs-hadoop/... .../mod_clusterfs/... .../scheduler/... .../swift/... The license was not changed in any of the non-building xlators. The license was not changed in any of the xlators that seemed — to me — to be clearly server-side only, e.g. protocol/server Note too that copyright was changed along with the license; I did not change the copyright in files where the license did not change. If you find any errors or ommissions please don't hesitate to let me know. The complete list of files with the license change is: libglusterfs/src/byte-order.h libglusterfs/src/call-stub.c libglusterfs/src/call-stub.h libglusterfs/src/checksum.c libglusterfs/src/checksum.h libglusterfs/src/circ-buff.c libglusterfs/src/circ-buff.h libglusterfs/src/common-utils.c libglusterfs/src/common-utils.h libglusterfs/src/compat-errno.c libglusterfs/src/compat-errno.h libglusterfs/src/compat.c libglusterfs/src/compat.h libglusterfs/src/daemon.c libglusterfs/src/daemon.h libglusterfs/src/defaults.c libglusterfs/src/defaults.h libglusterfs/src/dict.c libglusterfs/src/dict.h libglusterfs/src/event-history.c libglusterfs/src/event-history.h libglusterfs/src/event.c libglusterfs/src/event.h libglusterfs/src/fd-lk.c libglusterfs/src/fd-lk.h libglusterfs/src/fd.c libglusterfs/src/fd.h libglusterfs/src/gf-dirent.c libglusterfs/src/gf-dirent.h libglusterfs/src/globals.c libglusterfs/src/globals.h libglusterfs/src/glusterfs.h libglusterfs/src/graph-print.c libglusterfs/src/graph-utils.h libglusterfs/src/graph.c libglusterfs/src/hashfn.c libglusterfs/src/hashfn.h libglusterfs/src/iatt.h libglusterfs/src/inode.c libglusterfs/src/inode.h libglusterfs/src/iobuf.c libglusterfs/src/iobuf.h libglusterfs/src/latency.c libglusterfs/src/latency.h libglusterfs/src/list.h libglusterfs/src/lkowner.h libglusterfs/src/locking.h libglusterfs/src/logging.c libglusterfs/src/logging.h libglusterfs/src/mem-pool.c libglusterfs/src/mem-pool.h libglusterfs/src/mem-types.h libglusterfs/src/options.c libglusterfs/src/options.h libglusterfs/src/rbthash.c libglusterfs/src/rbthash.h libglusterfs/src/run.c libglusterfs/src/run.h libglusterfs/src/scheduler.c libglusterfs/src/scheduler.h libglusterfs/src/stack.c libglusterfs/src/stack.h libglusterfs/src/statedump.c libglusterfs/src/statedump.h libglusterfs/src/syncop.c libglusterfs/src/syncop.h libglusterfs/src/syscall.c libglusterfs/src/syscall.h libglusterfs/src/timer.c libglusterfs/src/timer.h libglusterfs/src/trie.c libglusterfs/src/trie.h libglusterfs/src/xlator.c libglusterfs/src/xlator.h libglusterfsclient/src/libglusterfsclient-dentry.c libglusterfsclient/src/libglusterfsclient-internals.h libglusterfsclient/src/libglusterfsclient.c libglusterfsclient/src/libglusterfsclient.h rpc/rpc-lib/src/auth-glusterfs.c rpc/rpc-lib/src/auth-null.c rpc/rpc-lib/src/auth-unix.c rpc/rpc-lib/src/protocol-common.h rpc/rpc-lib/src/rpc-clnt.c rpc/rpc-lib/src/rpc-clnt.h rpc/rpc-lib/src/rpc-transport.c rpc/rpc-lib/src/rpc-transport.h rpc/rpc-lib/src/rpcsvc-auth.c rpc/rpc-lib/src/rpcsvc-common.h rpc/rpc-lib/src/rpcsvc.c rpc/rpc-lib/src/rpcsvc.h rpc/rpc-lib/src/xdr-common.h rpc/rpc-lib/src/xdr-rpc.c rpc/rpc-lib/src/xdr-rpc.h rpc/rpc-lib/src/xdr-rpcclnt.c rpc/rpc-lib/src/xdr-rpcclnt.h rpc/rpc-transport/rdma/src/name.c rpc/rpc-transport/rdma/src/name.h rpc/rpc-transport/rdma/src/rdma.c rpc/rpc-transport/rdma/src/rdma.h rpc/rpc-transport/socket/src/name.c rpc/rpc-transport/socket/src/name.h rpc/rpc-transport/socket/src/socket.c rpc/rpc-transport/socket/src/socket.h xlators/cluster/afr/src/afr-common.c xlators/cluster/afr/src/afr-dir-read.c xlators/cluster/afr/src/afr-dir-read.h xlators/cluster/afr/src/afr-dir-write.c xlators/cluster/afr/src/afr-dir-write.h xlators/cluster/afr/src/afr-inode-read.c xlators/cluster/afr/src/afr-inode-read.h xlators/cluster/afr/src/afr-inode-write.c xlators/cluster/afr/src/afr-inode-write.h xlators/cluster/afr/src/afr-lk-common.c xlators/cluster/afr/src/afr-mem-types.h xlators/cluster/afr/src/afr-open.c xlators/cluster/afr/src/afr-self-heal-algorithm.c xlators/cluster/afr/src/afr-self-heal-algorithm.h xlators/cluster/afr/src/afr-self-heal-common.c xlators/cluster/afr/src/afr-self-heal-common.h xlators/cluster/afr/src/afr-self-heal-data.c xlators/cluster/afr/src/afr-self-heal-entry.c xlators/cluster/afr/src/afr-self-heal-metadata.c xlators/cluster/afr/src/afr-self-heal.h xlators/cluster/afr/src/afr-self-heald.c xlators/cluster/afr/src/afr-self-heald.h xlators/cluster/afr/src/afr-transaction.c xlators/cluster/afr/src/afr-transaction.h xlators/cluster/afr/src/afr.c xlators/cluster/afr/src/afr.h xlators/cluster/afr/src/pump.c xlators/cluster/afr/src/pump.h xlators/cluster/dht/src/dht-common.c xlators/cluster/dht/src/dht-common.h xlators/cluster/dht/src/dht-diskusage.c xlators/cluster/dht/src/dht-hashfn.c xlators/cluster/dht/src/dht-helper.c xlators/cluster/dht/src/dht-inode-read.c xlators/cluster/dht/src/dht-inode-write.c xlators/cluster/dht/src/dht-layout.c xlators/cluster/dht/src/dht-linkfile.c xlators/cluster/dht/src/dht-mem-types.h xlators/cluster/dht/src/dht-rebalance.c xlators/cluster/dht/src/dht-rename.c xlators/cluster/dht/src/dht-selfheal.c xlators/cluster/dht/src/dht.c xlators/cluster/dht/src/nufa.c xlators/cluster/dht/src/switch.c xlators/cluster/stripe/src/stripe-helpers.c xlators/cluster/stripe/src/stripe-mem-types.h xlators/cluster/stripe/src/stripe.c xlators/cluster/stripe/src/stripe.h xlators/features/index/src/index-mem-types.h ¹ xlators/features/index/src/index.c ¹ xlators/features/index/src/index.h ¹ xlators/performance/io-cache/src/io-cache.c xlators/performance/io-cache/src/io-cache.h xlators/performance/io-cache/src/ioc-inode.c xlators/performance/io-cache/src/ioc-mem-types.h xlators/performance/io-cache/src/page.c xlators/performance/io-threads/src/io-threads.c xlators/performance/io-threads/src/io-threads.h xlators/performance/io-threads/src/iot-mem-types.h xlators/performance/md-cache/src/md-cache-mem-types.h xlators/performance/md-cache/src/md-cache.c xlators/performance/quick-read/src/quick-read-mem-types.h xlators/performance/quick-read/src/quick-read.c xlators/performance/quick-read/src/quick-read.h xlators/performance/read-ahead/src/page.c xlators/performance/read-ahead/src/read-ahead-mem-types.h xlators/performance/read-ahead/src/read-ahead.c xlators/performance/read-ahead/src/read-ahead.h xlators/performance/symlink-cache/src/symlink-cache.c xlators/performance/write-behind/src/write-behind-mem-types.h xlators/performance/write-behind/src/write-behind.c xlators/protocol/auth/addr/src/addr.c ¹ xlators/protocol/auth/login/src/login.c ¹ xlators/protocol/client/src/client-callback.c xlators/protocol/client/src/client-handshake.c xlators/protocol/client/src/client-helpers.c xlators/protocol/client/src/client-lk.c xlators/protocol/client/src/client-mem-types.h xlators/protocol/client/src/client.c xlators/protocol/client/src/client.h xlators/protocol/client/src/client3_1-fops.c ¹ Copyright only, license reverted to original Change-Id: If560e826c61b6b26f8b9af7bed6e4bcbaeba31a8 BUG: 820551 Signed-off-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.com/3304 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* performance/write-behind: queue setattr fop with writes always.Raghavendra G2012-03-271-8/+1
| | | | | | | | | | | | | | stat returned in setattr_cbk can be cached by the kernel. Hence it is always necessary that we return correct stat, which implies that setattr should not be out of order with respect to write fops. Change-Id: I305feeb4802f8a41ffaf032100832cbd65dfc5c1 BUG: 765443 Signed-off-by: Raghavendra G <raghavendra@gluster.com> Reviewed-on: http://review.gluster.com/3011 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* core: adding extra data for fopsAmar Tumballi2012-03-221-98/+109
| | | | | | | | | | | | | with this change, the xlator APIs will have a dictionary as extra argument, which is passed between all the layers. This can be utilized for overloading in some of the operations. Change-Id: I58a8186b3ef647650280e63f3e5e9b9de7827b40 Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 782265 Reviewed-on: http://review.gluster.com/2960 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* mempool: adjustments in pool sizesAmar Tumballi2012-02-221-1/+1
| | | | | | | | | | | | | | | | | * while creating 'rpc_clnt', the caller knows what would be the ideal load on it, so an extra argument to set some pool sizes * while creating 'rpcsvc', the caller knows what would be the ideal load of it, so an extra argument to set request pool size * cli memory footprint is reduced Change-Id: Ie245216525b450e3373ef55b654b4cd30741347f Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 765336 Reviewed-on: http://review.gluster.com/2784 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* core: utilize mempool for frame->local allocationsAmar Tumballi2012-02-211-16/+20
| | | | | | | | | | | | | | | in each translator, which uses 'frame->local', we are using GF_CALLOC/GF_FREE, which would be costly considering the number of allocation happening in a lifetime of 'fop'. It would be good to utilize the mem pool framework for xlator's local structures, so there is no allocation overhead. Change-Id: Ida6e65039a24d9c219b380aa1c3559f36046dc94 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 765336 Reviewed-on: http://review.gluster.com/2772 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Honor O_SYNC etc. properly.Jeff Darcy2012-02-211-2/+28
| | | | | | | | | Change-Id: I3dd90fe230386ad5571c5e639f27460e3d003f0e BUG: 773100 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.com/2626 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* iobuf: use 'iobuf_get2()' to get variable sized buffersAmar Tumballi2012-02-201-0/+1
| | | | | | | | | | | added 'TODO' in places where it is missing. Change-Id: Ia802c94e3bb76930f7c88c990f078525be5459f5 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 765264 Reviewed-on: http://review.gluster.com/388 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* core: add an extra flag to readv()/writev() APIAmar Tumballi2012-02-141-8/+10
| | | | | | | | | | | | needed to implement a proper handling of open flag alterations using fcntl() on fd. Change-Id: Ic280d5db6f1dc0418d5c439abb8db1d3ac21ced0 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 782265 Reviewed-on: http://review.gluster.com/2723 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* core: GFID filehandle based backend and anonymous FDsAnand Avati2012-01-201-55/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. What -------- This change introduces an infrastructure change in the filesystem which lets filesystem operation address objects (inodes) just by its GFID. Thus far GFID has been a unique identifier of a user-visible inode. But in terms of addressability the only mechanism thus far has been the backend filesystem path, which could be derived from the GFID only if it was cached in the inode table along with the entire set of dentry ancestry leading up to the root. This change essentially decouples addressability from the namespace. It is no more necessary to be aware of the parent directory to address a file or directory. 2. Why ------- The biggest use case for such a feature is NFS for generating persistent filehandles. So far the technique for generating filehandles in NFS has been to encode path components so that the appropriate inode_t can be repopulated into the inode table by means of a recursive lookup of each component top-down. Another use case is the ability to perform more intelligent self-healing and rebalancing of inodes with hardlinks and also to detect renames. A derived feature from GFID filehandles is anonymous FDs. An anonymous FD is an internal USABLE "fd_t" which does not map to a user opened file descriptor or to an internal ->open()'d fd. The ability to address a file by the GFID eliminates the need to have a persistent ->open()'d fd for the purpose of avoiding the namespace. This improves NFS read/write performance significantly eliminating open/close calls and also fixes some of today's limitations (like keeping an FD open longer than necessary resulting in disk space leakage) 3. How ------- At each storage/posix translator level, every file is hardlinked inside a hidden .glusterfs directory (under the top level export) with the name as the ascii-encoded standard UUID format string. For reasons of performance and scalability there is a two-tier classification of those hardlinks under directories with the initial parts of the UUID string as the directory names. For directories (which cannot be hardlinked), the approach is to use a symlink which dereferences the parent GFID path along with basename of the directory. The parent GFID dereference will in turn be a dereference of the grandparent with the parent's basename, and so on recursively up to the root export. 4. Development --------------- 4a. To leverage the ability to address an inode by its GFID, the technique is to perform a "nameless lookup". This means, to populate a loc_t structure as: loc_t { pargfid: NULL parent: NULL name: NULL path: NULL gfid: GFID to be looked up [out parameter] inode: inode_new () result [in parameter] } and performing such lookup will return in its callback an inode_t populated with the right contexts and a struct iatt which can be used to perform an inode_link () on the inode (without a parent and basename). The inode will now be hashed and linked in the inode table and findable via inode_find(). A fundamental change moving forward is that the primary fields in a loc_t structure are now going to be (pargfid, name) and (gfid) depending on the kind of FOP. So far path had been the primary field for operations. The remaining fields only serve as hints/helpers. 4b. If read/write is to be performed on an inode_t, the approach so far has been to: fd_create(), STACK_WIND(open, fd), fd_bind (in callback) and then perform STACK_WIND(read, fd) etc. With anonymous fds now you can do fd_anonymous (inode), STACK_WIND (read, fd). This results in great boost in performance in the inbuilt NFS server. 5. Misc ------- The inode_ctx_put[2] has been renamed to inode_ctx_set[2] to be consistent with the rest of the codebase. Change-Id: Ie4629edf6bd32a595f4d7f01e90c0a01f16fb12f BUG: 781318 Reviewed-on: http://review.gluster.com/669 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* glusterfs: An effort to fix all the spell mistakes and typoHarshavardhana2011-11-161-4/+4
| | | | | | | | | | | | | | | in the entire glusterfs codebase. This patch fixes many of spell mistakes and typo in the entire glusterfs codebase and all supported modules. Change-Id: I83238a41aa08118df3cf4d1d605505dd3cda35a1 BUG: 3809 Signed-off-by: Harshavardhana <fharshav@redhat.com> Reviewed-on: http://review.gluster.com/731 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* statedump: do not print the inode number in the statedumpRaghavendra Bhat2011-10-011-58/+28
| | | | | | | | | | | | | Since gfid is used to uniquely identify a inode, in the statedump printing inode number is not necessary. Its suffecient if the gfid of the inode is printed. And do not print the the inodelks, entrylks and posixlks if the lock count is 0. Change-Id: Idac115fbce3a5684a0f02f8f5f20b194df8fb27f BUG: 3476 Reviewed-on: http://review.gluster.com/530 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* Eliminate many "var set but not used" warnings with newer gcc.Jeff Darcy2011-09-071-3/+0
| | | | | | | | | | | | | | | | This fixes ~200 such warnings, but leaves three categories untouched. (1) Rpcgen code. (2) Macros which set variables in the outer (calling function) scope. (3) Variables which are set via function calls which may have side effects. Change-Id: I6554555f78ed26134251504b038da7e94adacbcd BUG: 2550 Reviewed-on: http://review.gluster.com/371 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* option validation: further fixesAnand Avati2011-08-191-1/+1
| | | | | | | | | | fixes in option handling changes Change-Id: I0a44cdb088e3f08cd43d583a580736d0903fa88c BUG: 3415 Reviewed-on: http://review.gluster.com/261 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* xlator options: revamp xlator option validation/reconfigure codeAnand Avati2011-08-181-181/+21
| | | | | | | | | | | | | | | | | - move option handling to options.c (new file) - remove duplication of option validation code - remove duplication of gf_log / sprintf - get rid of xlator_t->validate_options - get rid of option validation in rpc-transport - get rid of validate_options() in every xlator - use xlator_volume_option_get to clean up many functions - introduce primitives to init/reconfigure option types Change-Id: I51798af72c8dc0a2b9e017424036eb3667dfc7ff BUG: 3415 Reviewed-on: http://review.gluster.com/235 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* Change Copyright current yearPranith Kumar K2011-08-101-1/+1
| | | | | | | | Change-Id: I2d10f2be44f518f496427f257988f1858e888084 BUG: 3348 Reviewed-on: http://review.gluster.com/200 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* LICENSE: s/GNU Affero General Public/GNU General Public/Pranith Kumar K2011-08-061-3/+3
| | | | | | | | Change-Id: I3914467611e573cccee0d22df93920cf1b2eb79f BUG: 3348 Reviewed-on: http://review.gluster.com/182 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* write-behind: changes in volume_options to assist volume set help/help-xmlKaushik BV2011-07-121-4/+43
| | | | | | | | Signed-off-by: Kaushik BV <kaushikbv@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2041 (volume set help option) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2041
* performance-translators: print path to file while dumping fdctx.Raghavendra G2011-06-141-0/+9
| | | | | | | | Signed-off-by: Raghavendra G <raghavendra@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 1059 (enhancements for getting statistics from performance translators) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1059
* performance/write-behind: initialize lock in wb-file before wb-file is set ↵Raghavendra G2011-04-291-20/+26
| | | | | | | | | | | | | | | | | | | in fd-ctx. - Consider a combination of fuse->quick-read->read-ahead->wb->client. quick-read can do open-behind (open is returned as success even before it is issued to backend) and hence the fd can already be in the list of open fds of the inode. A flush call on some other fd opened on the same inode, will result in ra_flush issuing flush calls to all the fds opened on the same inode. This can result in wb_flush trying to hold a lock on non-initialized lock there by causing memory corruption. Signed-off-by: Raghavendra G <raghavendra@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2679 (Crash in GlusterFS 3.0.5 in GSP) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2679
* CLI : Validate options farmework.Gaurav2011-03-231-56/+22
| | | | | | | | Signed-off-by: Gaurav <gaurav@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 2064 (NFS options are removed upon glusterd restart) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2064
* performance/write-behind: logging enhancementsRaghavendra G2011-03-171-262/+453
| | | | | | | | Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 2346 (Log message enhancements in GlusterFS - phase 1) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2346
* performance/write-behind: whitespace cleanup.Raghavendra G2011-03-171-492/+427
| | | | | | | | Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 2346 (Log message enhancements in GlusterFS - phase 1) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2346
* performance/write-behind: fix warnings due to format string mismatches ↵Raghavendra G2010-12-121-11/+11
| | | | | | | | | | during invocation of gf_log. Signed-off-by: Raghavendra G <raghavendra@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 2211 ((re)introduce warnings for format string/parameter mismatch) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2211
* Change assert to GF_ASSERTVijay Bellur2010-10-121-1/+1
| | | | | | | | Signed-off-by: Vijay Bellur <vijay@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 971 (dynamic volume management) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971
* Copyright changesVijay Bellur2010-10-111-1/+1
| | | | | | | | Signed-off-by: Vijay Bellur <vijay@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 971 (dynamic volume management) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971