summaryrefslogtreecommitdiffstats
path: root/xlators/cluster/dht/src/dht.c
Commit message (Collapse)AuthorAgeFilesLines
* glusterfs: zerofill supportM. Mohan Kumar2013-11-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for a new ZEROFILL fop. Zerofill writes zeroes to a file in the specified range. This fop will be useful when a whole file needs to be initialized with zero (could be useful for zero filled VM disk image provisioning or during scrubbing of VM disk images). Client/application can issue this FOP for zeroing out. Gluster server will zero out required range of bytes ie server offloaded zeroing. In the absence of this fop, client/application has to repetitively issue write (zero) fop to the server, which is very inefficient method because of the overheads involved in RPC calls and acknowledgements. WRITESAME is a SCSI T10 command that takes a block of data as input and writes the same data to other blocks and this write is handled completely within the storage and hence is known as offload . Linux ,now has support for SCSI WRITESAME command which is exposed to the user in the form of BLKZEROOUT ioctl. BD Xlator can exploit BLKZEROOUT ioctl to implement this fop. Thus zeroing out operations can be completely offloaded to the storage device , making it highly efficient. The fop takes two arguments offset and size. It zeroes out 'size' number of bytes in an opened file starting from 'offset' position. This patch adds zerofill support to the following areas: - libglusterfs - io-stats - performance/md-cache,open-behind - quota - cluster/afr,dht,stripe - rpc/xdr - protocol/client,server - io-threads - marker - storage/posix - libgfapi Client applications can exloit this fop by using glfs_zerofill introduced in libgfapi.FUSE support to this fop has not been added as there is no system call for this fop. Changes from previous version 3: * Removed redundant memory failure log messages Changes from previous version 2: * Rebased and fixed build error Changes from previous version 1: * Rebased for latest master TODO : * Add zerofill support to trace xlator * Expose zerofill capability as part of gluster volume info Here is a performance comparison of server offloaded zeofill vs zeroing out using repeated writes. [root@llmvm02 remote]# time ./offloaded aakash-test log 20 real 3m34.155s user 0m0.018s sys 0m0.040s [root@llmvm02 remote]# time ./manually aakash-test log 20 real 4m23.043s user 0m2.197s sys 0m14.457s [root@llmvm02 remote]# time ./offloaded aakash-test log 25; real 4m28.363s user 0m0.021s sys 0m0.025s [root@llmvm02 remote]# time ./manually aakash-test log 25 real 5m34.278s user 0m2.957s sys 0m18.808s The argument log is a file which we want to set for logging purpose and the third argument is size in GB . As we can see there is a performance improvement of around 20% with this fop. Change-Id: I081159f5f7edde0ddb78169fb4c21c776ec91a18 BUG: 1028673 Signed-off-by: Aakash Lal Das <aakash@linux.vnet.ibm.com> Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/5327 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterfs: discard (hole punch) supportBrian Foster2013-06-131-0/+1
| | | | | | | | | | | | | | | | Add support for the DISCARD file operation. Discard punches a hole in a file in the provided range. Block de-allocation is implemented via fallocate() (as requested via fuse and passed on to the brick fs) but a separate fop is created within gluster to emphasize the fact that discard changes file data (the discarded region is replaced with zeroes) and must invalidate caches where appropriate. BUG: 963678 Change-Id: I34633a0bfff2187afeab4292a15f3cc9adf261af Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/5090 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* gluster: add fallocate fop supportBrian Foster2013-06-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement support for the fallocate file operation. fallocate allocates blocks for a particular inode such that future writes to the associated region of the file are guaranteed not to fail with ENOSPC. This patch adds fallocate support to the following areas: - libglusterfs - mount/fuse - io-stats - performance/md-cache,open-behind - quota - cluster/afr,dht,stripe - rpc/xdr - protocol/client,server - io-threads - marker - storage/posix - libgfapi BUG: 949242 Change-Id: Ice8e61351f9d6115c5df68768bc844abbf0ce8bd Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/4969 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* dht: make nufa/switch call dht's init/finiJeff Darcy2013-04-031-726/+7
| | | | | | | | | | | | | These functions keep changing as new functionality is added, so copying and pasting the code is not a good solution. This way ensures that all fields get initialized properly no matter how much new stuff we throw in. Change-Id: I9e9b043d2d305d31e80cf5689465555b70312756 BUG: 924488 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/4710 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* dht: make DHT xattr names configurableJeff Darcy2013-03-211-0/+19
| | | | | | | | | | | | | | | | | | | | This is necessary to support "DHT over DHT" configurations, so that the upper and lower instances of DHT don't step all over each other. Why would we even consider such a thing? Because it gives us the ability to do data tiering and rack-aware placement, either by themselves or as complements to other functionality such as erasure codes or deduplication which save space but cost performance. By setting up the top-level DHT to place data into one of several lower-level DHT pools based on policy instead of pure elastic hashing, we get better performance for 90% of accesses and better storage efficiency for 90% of data, all for relatively low effort. Change-Id: I72e65c29edfc80babf39f7a2a00090f4588c4070 BUG: 924265 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/4694 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Added the validation function for subvols-per-directoryAvra Sengupta2013-02-281-0/+2
| | | | | | | | | Change-Id: Ie2259023b9001311a2032792639c3093054f6750 BUG: 896431 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/4552 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/distribute: Add filter to support file patterns to be migratedshishir gowda2013-02-271-0/+80
| | | | | | | | | | | | | | | | | | | | | | | 'gluster volume rebalance' command will be enhanced to support passing of these options/pattern. <pattern> is comma separated list as show below. The Precedence is from right to left. e.g- "*avi,*pdf:10MB,*:1KB" The precedence is as follows: migrate all files with size equal or greater than 1KB "*:1KB" migrate all pdf files with size equal or greater than 10MB "*pdf:10MB" migrate all avi files "*avi" With this option, it is possible to choose which files to migrate. Change-Id: I6d6d6a015bcbacf1debae2f278a2d92306fb055d BUG: 896456 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4366 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* distribute: add hash-name-regex optionJeff Darcy2013-02-181-1/+57
| | | | | | | | | | | | | | | | | | | | | This is to support the common "write to temp file then rename" idiom. In our case this causes us to create a linkfile during the rename in (N-1)/N cases, with a significant impact on performance e.g. for UFO which uses this idiom. This patch allows the user to specify up to two regular expressions that separate the permanent and transient parts of a temp-file name: rsync-hash-regex reimplements the existing RSYNC_FRIENDLY_NAME pattern where the temporary name for EXAMPLE is .EXAMPLE.suffix extra-hash-regex can be used for a second pattern, active concurrently, and can include alternatives via the POSIX extended-regex syntax Change-Id: Ic1a6ed19324bc31fefe91ee34b8478877a9c5d2c BUG: 912564 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/4116 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd/cli: Updated the options descriptions for "volume set help"Avra Sengupta2013-01-211-4/+24
| | | | | | | | | Change-Id: I0db00b7334bb9707ab48bd661ac03a3ad818d6e4 BUG: 893458 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/4393 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/distribute: re-set layouts to prevent overlapsshishir gowda2012-12-111-0/+1
| | | | | | | | | | | | | | | | | When subvols-per-directory option is used, with bricks addition/removal the layouts might get distributed to other subvols, which were not part of the layout before. We need to clean up layouts on old subvolumes, to prevent overlaps. Also, we need to make sure if layout-cnt is never less than subvolume-cnt. Change-Id: I00994a092ca0c99aedcc41bd9412d43460f88a04 BUG: 884455 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4281 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* libglusterfs: Implement float percentagePranith Kumar K2012-11-231-3/+3
| | | | | | | | | Change-Id: Ia7ea63471f0bbd74686873f5f6f183475880f1a0 BUG: 839595 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.org/4162 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: dump the layout information of directories onlyRaghavendra Bhat2012-11-191-9/+18
| | | | | | | | | | | | | | | testcase: The changes are for removing gf_log from statedump related sections in dht and using pthread_mutex_trylock in statedump sections. Changes are internal. So tests were done by attaching gdb to the process and executing by manually changing the values of some of the pointers. Change-Id: I41fa76c1812b462cb76f5bbf2fd14de080e73895 BUG: 843822 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/4117 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/distribute: Always return the latest time in struct iatt.shishir gowda2012-10-161-8/+2
| | | | | | | | | | | | | | | | | | | save the a/c/mtime in inode_ctx, and dht_inode_ctx_update checks the passed iatte, and updates the stat's time, and inode_ctx's time accordingly. For preparent times, only the iatt stat to be returned is updated, not the ctx. With this, update, WIPE is removed, as we would always be passing back the latest mtime, and hence cache times will be relevant. TODO-handle rename WIPE calls Change-Id: I8e4c738cd830f3fafeef789c9181f9c242ac96a2 BUG: 857791 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/3737 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/dht: handle percent option for 'min-free-disk'Amar Tumballi2012-09-071-0/+11
| | | | | | | | | | | | | | | * with the init option cleanups, setting of 'conf->disk_unit' was reset, which made it not set the '%' in the option. * bring a global check, which makes the option assume its percent, as long as value is < 100. Change-Id: I00bd1395a309cdc596a2b2b80304c6d98696a24a Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 852889 Reviewed-on: http://review.gluster.org/3918 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/distribute: remove gf_log() from statedump functionsAmar Tumballi2012-09-061-3/+0
| | | | | | | | | Change-Id: I83cccab6819d6a74e96c2717ca539fa1568cac89 Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 843822 Reviewed-on: http://review.gluster.org/3912 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/dht: Optimize readdirp calls in DHTshishir gowda2012-08-131-0/+8
| | | | | | | | | | | | | | | | | Bring in option which is supported by posix xlator to filter out directory's entries from being returned. DHT would now request non-first subvols to filter out directory entries. dht xlator-option readdir-optimize will enable this optimization Change-Id: I35224bc81c9657f54f952efac02790276c35ded5 BUG: 838199 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.com/3772 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* remove useless if-before-free (and free-like) functionsJim Meyering2012-07-131-14/+7
| | | | | | | | | | | | See comments in http://bugzilla.redhat.com/839925 for the code to perform this change. Signed-off-by: Jim Meyering <meyering@redhat.com> BUG: 839925 Change-Id: I10e4ecff16c3749fe17c2831c516737e08a3205a Reviewed-on: http://review.gluster.com/3661 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* core: coverity fixes (mostly resource leak fixes)Amar Tumballi2012-06-051-3/+5
| | | | | | | | | | | currently working on obvious resource leak reports in coverity Change-Id: I261f4c578987b16da399ab5a504ad0fda0b176b1 Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 789278 Reviewed-on: http://review.gluster.com/3265 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd/rebalance: Display run-time of rebalance processshishir gowda2012-05-241-1/+16
| | | | | | | | | | | | | | | | | Added run-time value field to cli output of rebalance/remove-brick. A new cluster/distribute boolean option rebalance-stats when set to ON, time taken for migration of each file is logged. With rebalance-stats OFF (default), rebalance logs will only have entries showing time spent in each directory. Change-Id: I02a8918621120068cd71ffaf2999d30b3a2d10a2 BUG: 821987 Signed-off-by: shishir gowda <shishirng@gluster.com> Reviewed-on: http://review.gluster.com/3303 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* distribute: use global synctask 'env' instead of localAmar Tumballi2012-05-231-9/+0
| | | | | | | | | | | | | | creating a local synctask_env can lead to creating of many more syncop threads than required. The current syncop logic can handle the scale-up/scale-down of threads depending on the load. Hence, its neater to use global synctask env. Change-Id: Id46f963a0190c0154513317ae03323db155ac15a Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 823774 Reviewed-on: http://review.gluster.com/3412 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* license: dual license under GPLV2 and LGPLV3+Kaleb KEITHLEY2012-05-101-14/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note that the license was not changed in any of the following: .../argp-standalone/... .../booster/... .../cli/... .../contrib/... .../extras/... .../glusterfsd/... .../glusterfs-hadoop/... .../mod_clusterfs/... .../scheduler/... .../swift/... The license was not changed in any of the non-building xlators. The license was not changed in any of the xlators that seemed — to me — to be clearly server-side only, e.g. protocol/server Note too that copyright was changed along with the license; I did not change the copyright in files where the license did not change. If you find any errors or ommissions please don't hesitate to let me know. The complete list of files with the license change is: libglusterfs/src/byte-order.h libglusterfs/src/call-stub.c libglusterfs/src/call-stub.h libglusterfs/src/checksum.c libglusterfs/src/checksum.h libglusterfs/src/circ-buff.c libglusterfs/src/circ-buff.h libglusterfs/src/common-utils.c libglusterfs/src/common-utils.h libglusterfs/src/compat-errno.c libglusterfs/src/compat-errno.h libglusterfs/src/compat.c libglusterfs/src/compat.h libglusterfs/src/daemon.c libglusterfs/src/daemon.h libglusterfs/src/defaults.c libglusterfs/src/defaults.h libglusterfs/src/dict.c libglusterfs/src/dict.h libglusterfs/src/event-history.c libglusterfs/src/event-history.h libglusterfs/src/event.c libglusterfs/src/event.h libglusterfs/src/fd-lk.c libglusterfs/src/fd-lk.h libglusterfs/src/fd.c libglusterfs/src/fd.h libglusterfs/src/gf-dirent.c libglusterfs/src/gf-dirent.h libglusterfs/src/globals.c libglusterfs/src/globals.h libglusterfs/src/glusterfs.h libglusterfs/src/graph-print.c libglusterfs/src/graph-utils.h libglusterfs/src/graph.c libglusterfs/src/hashfn.c libglusterfs/src/hashfn.h libglusterfs/src/iatt.h libglusterfs/src/inode.c libglusterfs/src/inode.h libglusterfs/src/iobuf.c libglusterfs/src/iobuf.h libglusterfs/src/latency.c libglusterfs/src/latency.h libglusterfs/src/list.h libglusterfs/src/lkowner.h libglusterfs/src/locking.h libglusterfs/src/logging.c libglusterfs/src/logging.h libglusterfs/src/mem-pool.c libglusterfs/src/mem-pool.h libglusterfs/src/mem-types.h libglusterfs/src/options.c libglusterfs/src/options.h libglusterfs/src/rbthash.c libglusterfs/src/rbthash.h libglusterfs/src/run.c libglusterfs/src/run.h libglusterfs/src/scheduler.c libglusterfs/src/scheduler.h libglusterfs/src/stack.c libglusterfs/src/stack.h libglusterfs/src/statedump.c libglusterfs/src/statedump.h libglusterfs/src/syncop.c libglusterfs/src/syncop.h libglusterfs/src/syscall.c libglusterfs/src/syscall.h libglusterfs/src/timer.c libglusterfs/src/timer.h libglusterfs/src/trie.c libglusterfs/src/trie.h libglusterfs/src/xlator.c libglusterfs/src/xlator.h libglusterfsclient/src/libglusterfsclient-dentry.c libglusterfsclient/src/libglusterfsclient-internals.h libglusterfsclient/src/libglusterfsclient.c libglusterfsclient/src/libglusterfsclient.h rpc/rpc-lib/src/auth-glusterfs.c rpc/rpc-lib/src/auth-null.c rpc/rpc-lib/src/auth-unix.c rpc/rpc-lib/src/protocol-common.h rpc/rpc-lib/src/rpc-clnt.c rpc/rpc-lib/src/rpc-clnt.h rpc/rpc-lib/src/rpc-transport.c rpc/rpc-lib/src/rpc-transport.h rpc/rpc-lib/src/rpcsvc-auth.c rpc/rpc-lib/src/rpcsvc-common.h rpc/rpc-lib/src/rpcsvc.c rpc/rpc-lib/src/rpcsvc.h rpc/rpc-lib/src/xdr-common.h rpc/rpc-lib/src/xdr-rpc.c rpc/rpc-lib/src/xdr-rpc.h rpc/rpc-lib/src/xdr-rpcclnt.c rpc/rpc-lib/src/xdr-rpcclnt.h rpc/rpc-transport/rdma/src/name.c rpc/rpc-transport/rdma/src/name.h rpc/rpc-transport/rdma/src/rdma.c rpc/rpc-transport/rdma/src/rdma.h rpc/rpc-transport/socket/src/name.c rpc/rpc-transport/socket/src/name.h rpc/rpc-transport/socket/src/socket.c rpc/rpc-transport/socket/src/socket.h xlators/cluster/afr/src/afr-common.c xlators/cluster/afr/src/afr-dir-read.c xlators/cluster/afr/src/afr-dir-read.h xlators/cluster/afr/src/afr-dir-write.c xlators/cluster/afr/src/afr-dir-write.h xlators/cluster/afr/src/afr-inode-read.c xlators/cluster/afr/src/afr-inode-read.h xlators/cluster/afr/src/afr-inode-write.c xlators/cluster/afr/src/afr-inode-write.h xlators/cluster/afr/src/afr-lk-common.c xlators/cluster/afr/src/afr-mem-types.h xlators/cluster/afr/src/afr-open.c xlators/cluster/afr/src/afr-self-heal-algorithm.c xlators/cluster/afr/src/afr-self-heal-algorithm.h xlators/cluster/afr/src/afr-self-heal-common.c xlators/cluster/afr/src/afr-self-heal-common.h xlators/cluster/afr/src/afr-self-heal-data.c xlators/cluster/afr/src/afr-self-heal-entry.c xlators/cluster/afr/src/afr-self-heal-metadata.c xlators/cluster/afr/src/afr-self-heal.h xlators/cluster/afr/src/afr-self-heald.c xlators/cluster/afr/src/afr-self-heald.h xlators/cluster/afr/src/afr-transaction.c xlators/cluster/afr/src/afr-transaction.h xlators/cluster/afr/src/afr.c xlators/cluster/afr/src/afr.h xlators/cluster/afr/src/pump.c xlators/cluster/afr/src/pump.h xlators/cluster/dht/src/dht-common.c xlators/cluster/dht/src/dht-common.h xlators/cluster/dht/src/dht-diskusage.c xlators/cluster/dht/src/dht-hashfn.c xlators/cluster/dht/src/dht-helper.c xlators/cluster/dht/src/dht-inode-read.c xlators/cluster/dht/src/dht-inode-write.c xlators/cluster/dht/src/dht-layout.c xlators/cluster/dht/src/dht-linkfile.c xlators/cluster/dht/src/dht-mem-types.h xlators/cluster/dht/src/dht-rebalance.c xlators/cluster/dht/src/dht-rename.c xlators/cluster/dht/src/dht-selfheal.c xlators/cluster/dht/src/dht.c xlators/cluster/dht/src/nufa.c xlators/cluster/dht/src/switch.c xlators/cluster/stripe/src/stripe-helpers.c xlators/cluster/stripe/src/stripe-mem-types.h xlators/cluster/stripe/src/stripe.c xlators/cluster/stripe/src/stripe.h xlators/features/index/src/index-mem-types.h ¹ xlators/features/index/src/index.c ¹ xlators/features/index/src/index.h ¹ xlators/performance/io-cache/src/io-cache.c xlators/performance/io-cache/src/io-cache.h xlators/performance/io-cache/src/ioc-inode.c xlators/performance/io-cache/src/ioc-mem-types.h xlators/performance/io-cache/src/page.c xlators/performance/io-threads/src/io-threads.c xlators/performance/io-threads/src/io-threads.h xlators/performance/io-threads/src/iot-mem-types.h xlators/performance/md-cache/src/md-cache-mem-types.h xlators/performance/md-cache/src/md-cache.c xlators/performance/quick-read/src/quick-read-mem-types.h xlators/performance/quick-read/src/quick-read.c xlators/performance/quick-read/src/quick-read.h xlators/performance/read-ahead/src/page.c xlators/performance/read-ahead/src/read-ahead-mem-types.h xlators/performance/read-ahead/src/read-ahead.c xlators/performance/read-ahead/src/read-ahead.h xlators/performance/symlink-cache/src/symlink-cache.c xlators/performance/write-behind/src/write-behind-mem-types.h xlators/performance/write-behind/src/write-behind.c xlators/protocol/auth/addr/src/addr.c ¹ xlators/protocol/auth/login/src/login.c ¹ xlators/protocol/client/src/client-callback.c xlators/protocol/client/src/client-handshake.c xlators/protocol/client/src/client-helpers.c xlators/protocol/client/src/client-lk.c xlators/protocol/client/src/client-mem-types.h xlators/protocol/client/src/client.c xlators/protocol/client/src/client.h xlators/protocol/client/src/client3_1-fops.c ¹ Copyright only, license reverted to original Change-Id: If560e826c61b6b26f8b9af7bed6e4bcbaeba31a8 BUG: 820551 Signed-off-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.com/3304 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* Print non-zero timestamps in statedumpv3.3.0qa27Vijay Bellur2012-03-111-1/+4
| | | | | | | | | Change-Id: I686842bc0934128b825828066f613c2a6280ceb9 BUG: 801689 Reviewed-on: http://review.gluster.com/2912 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* glusterd/rebalance: Bring in support for parallel rebalanceshishirng2012-03-071-0/+17
| | | | | | | | | | | | | | | | | This patch, enables rebalance processes to be started on all nodes where the volume is spread across (1 process per node) node-uuid xattr identifies which node takes ownership of the task to migrate the file. The model employed is push (src pushes to dst) Change-Id: Ieacd46a6216cf6ded841bbaebd10cfaea51c16d6 BUG: 763844 Signed-off-by: shishirng <shishirng@gluster.com> Reviewed-on: http://review.gluster.com/2873 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* mempool: adjustments in pool sizesAmar Tumballi2012-02-221-1/+1
| | | | | | | | | | | | | | | | | * while creating 'rpc_clnt', the caller knows what would be the ideal load on it, so an extra argument to set some pool sizes * while creating 'rpcsvc', the caller knows what would be the ideal load of it, so an extra argument to set request pool size * cli memory footprint is reduced Change-Id: Ie245216525b450e3373ef55b654b4cd30741347f Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 765336 Reviewed-on: http://review.gluster.com/2784 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* core: utilize mempool for frame->local allocationsAmar Tumballi2012-02-211-0/+7
| | | | | | | | | | | | | | | in each translator, which uses 'frame->local', we are using GF_CALLOC/GF_FREE, which would be costly considering the number of allocation happening in a lifetime of 'fop'. It would be good to utilize the mem pool framework for xlator's local structures, so there is no allocation overhead. Change-Id: Ida6e65039a24d9c219b380aa1c3559f36046dc94 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 765336 Reviewed-on: http://review.gluster.com/2772 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/dht: Support for hardlink rebalance when decommissioningshishir gowda2012-02-191-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The support for hardlink rebalance is only available for decommissioning of a node. this can be triggered in two ways 1. remove-brick start 2. if decommission node value is set in vol file, then a normal rebalance command The way we handle it is- if (nlink > 1) do * if src file doesnt have linkto xattr * mark src's linkto to the dst * else * perform a link on the dst * do a look up * if nlinks = dst.nlinks * migrate data * else * continue crawling done Signed-off-by: shishir gowda <shishirng@gluster.com> Change-Id: If43b5524b872fd1413e9f7aa7f436cb244e30d8d BUG: 763844 Reviewed-on: http://review.gluster.com/2737 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* cluster/dht: Rebalance will be a new glusterfs processshishirng2012-02-191-6/+40
| | | | | | | | | | | | | | | | | | | | | | | | rebalance will not use any maintainance clients. It is replaced by syncops, with the volfile. Brickop (communication between glusterd<->glusterfs process) is used for status and stop commands. Dept-first traversal of dir is maintained, but data is migrated as and when encounterd. fix-layout (dir) do Complete migrate-data of dir fix-layout (subdir) done Rebalance state is saved in the vol file, for restart-ability. A disconnect event and pidfile state determine the defrag-status Signed-off-by: shishirng <shishirng@gluster.com> Change-Id: Iec6c80c84bbb2142d840242c28db3d5f5be94d01 BUG: 763844 Reviewed-on: http://review.gluster.com/2540 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* complete the implementation of missing 'f**xattr()' fopsAmar Tumballi2012-01-251-0/+1
| | | | | | | | | | | | | | in debug/* and cluster/* translators and a syncop_fsetxattr() added a test case for testing the working of 'f-fop()' on fuse mount. Change-Id: I0c2aeeb30a0fb382ef2495cca1e66b00abaffd35 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 766571 Reviewed-on: http://review.gluster.com/802 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* core: add 'fremovexattr()' fopAmar Tumballi2012-01-251-0/+1
| | | | | | | | | | | so operations can be done on fd for extended attribute removal Change-Id: Ie026f1b53793aeb4ae33e96ea5408c7a97f34bf6 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 766571 Reviewed-on: http://review.gluster.com/778 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* cluster/distribute: Add support for 'min-free-inodes' on each distribute ↵Harshavardhana2011-11-231-3/+16
| | | | | | | | | | | | | | | | | subvolume. This change is required as increasingly large number of small files would cause inodes to run out before they run out on available disk space. It is highly necessary to support algorithmic checking of inodes too just as we do for disk space. Change-Id: I9b87405328d443825e239ee80ab664aceb50ee68 BUG: 3799 Signed-off-by: Harshavardhana <fharshav@redhat.com> Reviewed-on: http://review.gluster.com/730 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* glusterfs: An effort to fix all the spell mistakes and typoHarshavardhana2011-11-161-3/+3
| | | | | | | | | | | | | | | in the entire glusterfs codebase. This patch fixes many of spell mistakes and typo in the entire glusterfs codebase and all supported modules. Change-Id: I83238a41aa08118df3cf4d1d605505dd3cda35a1 BUG: 3809 Signed-off-by: Harshavardhana <fharshav@redhat.com> Reviewed-on: http://review.gluster.com/731 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* statedump: do not print the inode number in the statedumpRaghavendra Bhat2011-10-011-36/+20
| | | | | | | | | | | | | Since gfid is used to uniquely identify a inode, in the statedump printing inode number is not necessary. Its suffecient if the gfid of the inode is printed. And do not print the the inodelks, entrylks and posixlks if the lock count is 0. Change-Id: Idac115fbce3a5684a0f02f8f5f20b194df8fb27f BUG: 3476 Reviewed-on: http://review.gluster.com/530 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* support for de-commissioning a node using 'remove-brick'Amar Tumballi2011-09-131-6/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to achieve this, we now create volume-file with 'decommissioned-nodes' option in distribute volume, then just perform the rebalance set of operations (with 'force' flag set). now onwards, the 'remove-brick' (with 'start' option) operation tries to migrate data from removed bricks to existing bricks. 'remove-brick' also supports similar options as of replace-brick. * (no options) -> works as 'force', will have the current behavior of remove-brick, ie., no data-migration, volume changes. * start (starts remove-brick with data-migration/draining process, which takes care of migrating data and once complete, will commit the changes to volume file) * pause (stop data migration, but keep the volume file intact with extra options whatever is set) * abort (stop data-migration, and fall back to old configuration) * commit (if volume is stopped, commits the changes to volumefile) * force (stops the data-migration and commits the changes to volume file) Change-Id: I3952bcfbe604a0952e68b6accace7014d5e401d3 BUG: 1952 Reviewed-on: http://review.gluster.com/118 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* distribute rebalance: handle the open file migrationAmar Tumballi2011-09-121-18/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Complexity involved: To migrate a file with open fd, we have to notify the other client process which has the open fd, and make sure the write()s happening on that fd is properly synced to the migrated file. Once the migration is complete, the client process which has open-fd should get notified and it should start performing all the operations on the new subvolume, instead of earlier cached volume. How to solve the notification part: We can overload the 'postbuf' attribute in the _cbk() function to understand if a file is 'under-migration' or 'migration-complete' state. (This will be something similar to deciding whether a file is DHT-linkfile by its 'mode'). Overall change includes below mentioned major changes: 1. dht_linkfile is decided by only 2 factors (mode(01000), xattr(trusted.glusterfs.dht.linkto)), instead of earlier 3 factors (size==0) 2. in linkfile self-heal part (in 'dht_lookup_everywhere_cbk()'), don't delete a linkfile if there is a open-fd on it. It means, there may be a migration in progress. 3. if a file's revalidate fails with ENOENT, it may be due to file migration, and hence need a lookup_everywhere() 4. There will be 2 phases of file-migration. -> Phase 1: Migration in progress * The source data file will have SGID and STICKY bit set in its mode. * The source data file will have a 'linkto' xattr pointing the destination. * Destination file will have mode set to '01000', and 'linkto' xattr set to itself. -> Phase 2: File migration Complete * The source data file will have mode '01000', and will be 'truncated' to size 0. * The destination file will have inherited mode from the source. (without sgid and sticky bit) and its 'linkto' attribute will be removed. 4. Changes in distribute to work smoothly with a file which is in migration / got migrated. The 'fops' are divided into 3 categories, inode-read, inode-write and others. inode-read fops need to handle only 'phase 2' notification, where as, the inode-write fops need to handle both 'phase 1' and phase2. The inode-write operations will be done on source file, and if any of 'file-migration' procedures are detected in _cbk(), then the operations should be performed on the destination too. when a phase-2 is detected, then the inode-ctx itself should be changed to represent a new layout. With these changes, the open file migration will work smoothly with multiple clients. Change-Id: I512408463814e650f34c62ed009bf2101d016fd6 BUG: 3071 Reviewed-on: http://review.gluster.com/209 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* xlator options: revamp xlator option validation/reconfigure codeAnand Avati2011-08-181-143/+22
| | | | | | | | | | | | | | | | | - move option handling to options.c (new file) - remove duplication of option validation code - remove duplication of gf_log / sprintf - get rid of xlator_t->validate_options - get rid of option validation in rpc-transport - get rid of validate_options() in every xlator - use xlator_volume_option_get to clean up many functions - introduce primitives to init/reconfigure option types Change-Id: I51798af72c8dc0a2b9e017424036eb3667dfc7ff BUG: 3415 Reviewed-on: http://review.gluster.com/235 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* Change Copyright current yearPranith Kumar K2011-08-101-1/+1
| | | | | | | | Change-Id: I2d10f2be44f518f496427f257988f1858e888084 BUG: 3348 Reviewed-on: http://review.gluster.com/200 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* LICENSE: s/GNU Affero General Public/GNU General Public/Pranith Kumar K2011-08-061-3/+3
| | | | | | | | Change-Id: I3914467611e573cccee0d22df93920cf1b2eb79f BUG: 3348 Reviewed-on: http://review.gluster.com/182 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* distribute: bring in feature to do rebalance from insideAmar Tumballi2011-07-141-0/+9
| | | | | | | | | | | | | | | | do the 'rebalance' on a file with 'setxattr()' with 'distribute.migrate-data' key. the valid 'value' is "force" or anything else. current implementation of 'rebalance' is same as 'glusterd-rebalance', and hence comes with limitation of not rebalancing hardlinks, non-regular files, and files with open-fds. Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 3070 (After 'rebalance', the gfid of the file gets changed) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=3070
* cluster/distribute: bring in directory-spread-count optionAmar Tumballi2011-07-141-0/+35
| | | | | | | | | | global spread count option is given through volume file Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2257 (enhance distribute to control the spread count (ie, control the hashing range)) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2257
* dht: changes in volume_options to assist volume set help/help-xmlKaushik BV2011-07-121-3/+26
| | | | | | | | Signed-off-by: Kaushik BV <kaushikbv@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2041 (volume set help option) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2041
* cluster/distribute: remove the unused variable from 'conf'Amar Tumballi2011-06-161-9/+0
| | | | | | | | Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 3013 (error in the log file upon taking the statedump) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=3013
* DHT: Make assert-no-child-down a boolean optionshishir gowda2011-04-151-11/+11
| | | | | | | | Signed-off-by: shishir gowda <shishirng@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2765 (geo-replication should have mercy on brick failure) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2765
* DHT: Add xlator-option assert_no_child_downshishir gowda2011-04-141-0/+17
| | | | | | | | | | If this is set, when CHILD_DOWN event is received, call exit Signed-off-by: shishir gowda <shishirng@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2536 (gsync service introspection) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2536
* CLI : Validate options farmework.Gaurav2011-03-231-23/+20
| | | | | | | | Signed-off-by: Gaurav <gaurav@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 2064 (NFS options are removed upon glusterd restart) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2064
* cluster/dht: log enhancementsAmar Tumballi2011-03-171-35/+32
| | | | | | | | | Signed-off-by: Shishir Gowda <shishirng@gluster.com> Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 2346 (Log message enhancements in GlusterFS - phase 1) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2346
* cluster/dht: whitespace cleanupAmar Tumballi2011-03-171-152/+152
| | | | | | | | | | also fill tabs by spaces (untabify), and indent the code Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 2346 (Log message enhancements in GlusterFS - phase 1) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2346
* cluster/dht : implement a fsetxattr() commandAmar Tumballi2011-02-181-0/+1
| | | | | | | | | | Thanks to Jeff Darcy <jdarcy(at)redhat.com> for the patch Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 2319 (DHT uses wrong subvolume for fsetxattr) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2319
* cluster/distribute: fix warnings due to format string mismatches during ↵Raghavendra G2010-12-141-2/+2
| | | | | | | | | | invocation of gf_log. Signed-off-by: Raghavendra G <raghavendra@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 2211 ((re)introduce warnings for format string/parameter mismatch) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2211
* Copyright changesVijay Bellur2010-10-111-1/+1
| | | | | | | | Signed-off-by: Vijay Bellur <vijay@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 971 (dynamic volume management) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971
* Reply back to CLI on error, by validating each xlator's optsKaushik BV2010-10-051-0/+39
| | | | | | | | Signed-off-by: Kaushik BV <kaushikbv@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 1159 () URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1159