summaryrefslogtreecommitdiffstats
path: root/xlators/features
Commit message (Collapse)AuthorAgeFilesLines
...
* glusterfs: zerofill supportM. Mohan Kumar2013-11-101-0/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for a new ZEROFILL fop. Zerofill writes zeroes to a file in the specified range. This fop will be useful when a whole file needs to be initialized with zero (could be useful for zero filled VM disk image provisioning or during scrubbing of VM disk images). Client/application can issue this FOP for zeroing out. Gluster server will zero out required range of bytes ie server offloaded zeroing. In the absence of this fop, client/application has to repetitively issue write (zero) fop to the server, which is very inefficient method because of the overheads involved in RPC calls and acknowledgements. WRITESAME is a SCSI T10 command that takes a block of data as input and writes the same data to other blocks and this write is handled completely within the storage and hence is known as offload . Linux ,now has support for SCSI WRITESAME command which is exposed to the user in the form of BLKZEROOUT ioctl. BD Xlator can exploit BLKZEROOUT ioctl to implement this fop. Thus zeroing out operations can be completely offloaded to the storage device , making it highly efficient. The fop takes two arguments offset and size. It zeroes out 'size' number of bytes in an opened file starting from 'offset' position. This patch adds zerofill support to the following areas: - libglusterfs - io-stats - performance/md-cache,open-behind - quota - cluster/afr,dht,stripe - rpc/xdr - protocol/client,server - io-threads - marker - storage/posix - libgfapi Client applications can exloit this fop by using glfs_zerofill introduced in libgfapi.FUSE support to this fop has not been added as there is no system call for this fop. Changes from previous version 3: * Removed redundant memory failure log messages Changes from previous version 2: * Rebased and fixed build error Changes from previous version 1: * Rebased for latest master TODO : * Add zerofill support to trace xlator * Expose zerofill capability as part of gluster volume info Here is a performance comparison of server offloaded zeofill vs zeroing out using repeated writes. [root@llmvm02 remote]# time ./offloaded aakash-test log 20 real 3m34.155s user 0m0.018s sys 0m0.040s [root@llmvm02 remote]# time ./manually aakash-test log 20 real 4m23.043s user 0m2.197s sys 0m14.457s [root@llmvm02 remote]# time ./offloaded aakash-test log 25; real 4m28.363s user 0m0.021s sys 0m0.025s [root@llmvm02 remote]# time ./manually aakash-test log 25 real 5m34.278s user 0m2.957s sys 0m18.808s The argument log is a file which we want to set for logging purpose and the third argument is size in GB . As we can see there is a performance improvement of around 20% with this fop. Change-Id: I081159f5f7edde0ddb78169fb4c21c776ec91a18 BUG: 1028673 Signed-off-by: Aakash Lal Das <aakash@linux.vnet.ibm.com> Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/5327 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* client_t: phase 2, refactor server_ctx and locks_ctx outKaleb S. KEITHLEY2013-10-317-117/+423
| | | | | | | | | | | | | | remove server_ctx and locks_ctx from client_ctx directly and store as into discrete entities in the scratch_ctx hooking up dump will be in phase 3 BUG: 849630 Change-Id: I94cea328326db236cdfdf306cb381e4d58f58d4c Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/5678 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* features/gfid-access: Handle inode remap when parent inode is NULLPranith Kumar K2013-10-311-7/+2
| | | | | | | | | | Change-Id: Ic3f9d30d75df0bbbdf8fe28446fabe62d90fa854 BUG: 1024666 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/6194 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs: Add monotonic clocking counter for timer threadHarshavardhana2013-10-151-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gettimeofday() returns the current wall clock time and timezone. Using these functions in order to measure the passage of time (how long an operation took) therefore seems like a no-brainer. This time suffer's from some limitations: a. They have a low resolution: “High-performance” timing by definition, requires clock resolutions into the microseconds or better. b. They can jump forwards and backwards in time: Computer clocks all tick at slightly different rates, which causes the time to drift. Most systems have NTP enabled which periodically adjusts the system clock to keep them in sync with “actual” time. The adjustment can cause the clock to suddenly jump forward (artificially inflating your timing numbers) or jump backwards (causing your timing calculations to go negative or hugely positive). In such cases timer thread could go into an infinite loop. From 'man gettimeofday': ---------- .. .. The time returned by gettimeofday() is affected by discontinuous jumps in the system time (e.g., if the system administrator manually changes the system time). If you need a monotonically increasing clock, see clock_gettime(2). .. .. ---------- Rationale: For calculating interval timing for Timer thread, all that’s needed should be clock as a simple counter that increments at a stable rate. This is necessary to avoid the jumps which are caused by using "wall time", this counter must be monotonic that can never “tick” backwards, ever. Change-Id: I701d31e71a85a73d21a6c5cd15583e7a5a645eeb BUG: 1017993 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/6070 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: [Feature] Command implementation to get heal-countVenkatesh Somyajulu2013-10-142-13/+316
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently to know the number of files to be healed, either user has to go to backend and check the number of entries present in indices/xattrop directory. But if a volume consists of large number of bricks, going to each backend and counting the number of entries is a time-taking task. Otherwise user can give gluster volume heal vol-name info command but with this approach if no. of entries are very hugh in the indices/ xattrop directory, it will comsume time. So as a feature, new command is implemented. Command 1: gluster volume heal vn statistics heal-count This command will get the number of entries present in every brick of a volume. The output displays only entries count. Command 2: gluster volume heal vn statistics heal-count replica 192.168.122.1:/home/user/brickname Here if we are concerned with just one replica. So providing any one of the brick of a replica will get the number of entries to be healed for that replica only. Example: Replicate volume with replica count 2. Backend status: -------------- [root@dhcp-0-17 xattrop]# ls -lia | wc -l 1918 NOTE: Out of 1918, 2 entries are <xattrop-gfid> dummy entries so actual no. of entries to be healed are 1916. [root@dhcp-0-17 xattrop]# pwd /home/user/2ty/.glusterfs/indices/xattrop Command output: -------------- Gathering count of entries to be healed on volume volume3 has been successful Brick 192.168.122.1:/home/user/22iu Status: Brick is Not connected Entries count is not available Brick 192.168.122.1:/home/user/2ty Number of entries: 1916 Change-Id: I72452f3de50502dc898076ec74d434d9e77fd290 BUG: 1015990 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/6044 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* features/qemu-block: invoke bdrv_init() only onceBrian Foster2013-10-091-1/+5
| | | | | | | | | | | | | | | | bdrv_init() is intended to be invoked only once. If invoked again after initialization (i.e., due to graph changes), the block driver registration code can reinsert entries into bdrv_drivers, effectively corrupting the NULL terminated linked list. This manifests as an infinite loop for callers attempt to traverse the list (to locate a driver). BUG: 986775 Change-Id: I2f95bda3c753dece4648cdad009d0e2a2d16f644 Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/6021 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* features/changelog : Improvement in changelog "encoding-change".ajha2013-09-294-17/+26
| | | | | | | | | | | | | | | | | change in encoding method of changelog was critical section for "fop dispatch thread", "roll-over thread" and "reconfigure dispatch thread". In this patch the "encoding-method" is changed by the reconfigure dispatch thread lazily during handle_change, which solves the concurrency among the racing threads. BUG: 1002940 Change-Id: I78c3e8887efa46d0fcc60755cdf4243031cfa3eb Signed-off-by: Ajeet Jha <ajha@redhat.com> Reviewed-on: http://review.gluster.org/5844 Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net>
* core: block unused signals in created threadsAnand Avati2013-09-253-9/+9
| | | | | | | | | | | | | | | Block all signal except those which are set for explicit handling in glusterfs_signals_setup(). Since thread spawning code in libglusterfs and xlators can get called from application threads when used through libgfapi, it is necessary to do this blocking. Change-Id: Ia320f80521a83d2edcda50b9ad414583a0175281 BUG: 1011662 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/5995 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* qemu-block: support readdirpBrian Foster2013-09-171-0/+58
| | | | | | | | | | | | | | Support the readdirp fop in qemu-block to ensure that image files are handled correctly when readdirp is enabled. E.g., without readdirp support, incorrect stat data for formatted files can be reported back (and cached) via the client. BUG: 986775 Change-Id: Ibc4bd0b42548953ebe30832f3d853bb68095f0ac Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/5946 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* qemu-block: fix building from distribution tarball when glib2-devel is installedNiels de Vos2013-09-121-0/+8
| | | | | | | | | | | | | | | | | | | | | | Building RPMs from a 'make dist' tarball fails when qemu-block is enabled. Enabling is done automatically when the glib2 development files are available (enabled by ./configure). Manual building with: $ ./autogen.sh && ./configure && make dist && rpmbuild -ta *.tar.gz Building in mock works fine, glib2-devel is not installed by default so the qemu-block xlator gets disabled. This change also adds glib2-devel to the BuildRequires in the glusterfs.spec file, causing the qemu-block xlator to be built by default, and included in the glusterfs RPM. Change-Id: Ibb73628772586d9e07bbfde7a8ff2fc973489086 BUG: 986775 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/5896 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* gfid-access: Error logs for ga_newfile_parse_argsAvra Sengupta2013-09-051-11/+50
| | | | | | | | | | Change-Id: I7aab98a70793bee936272f0b795f4c22c3733dd2 BUG: 1001055 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5705 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* features/marker: force xtime updates (configurable) for client-pid = -1Venky Shankar2013-09-042-6/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is required by Geo-Replication that does auxillary mount with client-pid as -1 (which has special treatment at specific places in GlusterFS), to trigger xtime updates on the intermediate master in a cascading setup. Marker too had a check to "not" mark updates for geo-replication's auxillary mounts. With the new geo-replication design, xtimes are not set by the master on the slave for all entities. Due to this cascading setups were broken. This patch introduces "geo-replication.ignore-pid-check" option as a "override" for the client-pid check for gsyncd's client-pid. When this options is enabled, marker start "marking" even if the updates are from the special client. Geo-Replication on the detection of itself being an intermediate master, enables this option. Change-Id: I9f7140edd12fef5480595ee0f93f35b94cdb8345 BUG: 996371 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/5591 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Avra Sengupta <asengupt@redhat.com> Tested-by: Avra Sengupta <asengupt@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* gsyncd / geo-rep: Introduce basic crawl instrumentationVenky Shankar2013-09-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch extends the persistent instrumentation work done by Aravinda (@avishwa), by introducing a handfull of instrumentation variables for crawl. These variables are "pulled up" by glusterd in the event of a geo-replication status cli command and looks something like below: "Uptime=00:21:10;FilesSyned=2982;FilesPending=0;BytesPending=0;DeletesPending=0;" "FilesPending", "BytesPending" and "DeletesPending" are short-lived variables that are non-zero when a changelog is being processes (ie. when an active sync in ongoing). After a successfull changelog process "FilesPending" is summed up into "FilesSynced". The three short-lived variabled are then reset to zero and the data is persisted Additionally this patch also reverts some of the changes made for BZ #986929 (those were not needed). Change-Id: I948f1a0884ca71bc5e5bcfdc017d16c8c54fc30b BUG: 990420 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/5441 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* features/qemu-block: support for QCOW2 and QED formatsAnand Avati2013-09-0314-1/+2757
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for internals snapshots using QCOW2 and general framework for external snapshots (next patch) with QCOW2 and QED. For internal snapshots, the file must be "initialized" or "formatted" into QCOW2 format, and specify a file size. Snapshots can be created, deleted, and applied ("goto"). e.g: // Format and Initialize sh# setfattr -n trusted.glusterfs.block-format -v qcow2:10GB /mnt/imgfile sh# ls -l /mnt/imgfile -rw-r--r-- 1 root root 10G Jul 18 21:20 imgfile // Create a snapshot sh# setfattr -n trusted.glusterfs.block-snapshot-create -v name1 imgfile // Apply a snapshot sh# setfattr -n trusted.gluterfs.block-snapshot-goto -v name1 imgfile Change-Id: If993e057a9455967ba3fa9dcabb7f74b8b2cf4c3 BUG: 986775 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/5367 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
* features/locks : Improves debuggability of inode/entry locks.Anuradha Talur2013-09-036-41/+94
| | | | | | | | | | | | | | | | | Prints, in the statedump, the information about the mount that performed the inode/entry lk. For the entrylks that are granted after a blocked state, the blocked time is not printed. A patch for that will be sent later. Change-Id: Ib0c1ed21fa9328b435f96b590dd343f59814a08d BUG: 915629 Signed-off-by: Anuradha Talur <atalur@redhat.com> Reviewed-on: http://review.gluster.org/5712 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* core: remove GLUSTERFS_CREATE_MODE_KEY usageAmar Tumballi2013-08-231-12/+5
| | | | | | | | | Change-Id: I23b8cb7223b91a55af1cd4214f61bbe0e87351f6 BUG: 952029 Signed-off-by: Amar Tumballi <amarts@redhat.com> Reviewed-on: http://review.gluster.org/5683 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* gfid-access: virtual access to filesystem through gfid pathAmar Tumballi2013-08-216-1/+1313
| | | | | | | | | BUG: 952029 Change-Id: I7405d473d369a4a951836eceda4faccbad19ce0e Signed-off-by: Amar Tumballi <amarts@redhat.com> Reviewed-on: http://review.gluster.org/5497 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Try to start all bricks on 'start force'Kaushal M2013-08-181-4/+6
| | | | | | | | | | | | | | | | | | | A volume would fail to start if any one of the bricks fails staging or fails to start, even with the 'force' option. With this patch, when the 'force' option is given for a volume start, glusterd will continue and start other bricks even if one fails staging or starting. Also did a small fix in changelog, to prevent it crashing when it fails to init. Change-Id: I7efbd9ab13d12d69b0335ae54143fa17586f8f98 BUG: 994375 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/5510 Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* log: set ident to openlogBala.FA2013-08-131-1/+2
| | | | | | | | | | | | | | | | | | | | at syslog side, log message is identified by its properties like programname, pid, etc. brick/mount processes need to be identified uniquely as they are different process of gluterfsd/glusterfs. At rsyslog side, log separated by programname/app-name with pid works but bit hard to identify them in long run which process is for what brick/mount. This patch fixes by setting identity string at openlog() which sets programname/app-name as similar to old style log file prefixed by gluster, glusterd, glusterfs or glusterfsd Change-Id: Ia05068943fa67ae1663aaded1444cf84ea648db8 BUG: 928648 Signed-off-by: Bala.FA <barumuga@redhat.com> Reviewed-on: http://review.gluster.org/5541 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/locks: Convert old style metadata locks to new-stylePranith Kumar K2013-08-071-0/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: In 3.3, inode locks of both metadata and data are competing in same domain called data domain (old style). This coupled with eager-lock, delayed post-ops introduce delays for metadata operations like chmod, chown etc. To avoid this problem, inode locks for metadata ops are moved to different domain called metadata domain in 3.4 (new style). But when both 3.3 clients and 3.4 clients are present, 3.4 clients for metadata operations still need to take locks in "old style" so that proper synchronization happens across 3.3 and 3.4 clients. Only when all clients are >= 3.4 locks will be taken in "new style" for metadata locks. Because of this behavior as long as at least one 3.3 client is present, delays will be perceived for doing metadata operations on all 3.4 clients while data operations are in progress (Ex: Untar will untar one file per sec). Fix: Make locks xlators translate old-style metadata locks to new-style metadata locks. Since upgrade process suggests upgrading servers first and then clients, this approach gives good results. Tests: 1) Tested that old style metadata locks are converted to new style by locks xlator using gdb 2) Tested that disconnects purge locks in meta-data domain as well using gdb and statedumps. 3) Tested that untar performance is not hampered by meta-data and data operations. 4) Had two mounts one with orthogonal-meta-data on and other with orthogonal-meta-data off ran chmod 777 <file> on one mount and chmod 555 <file> on the other mount in while loops when I took statedumps I saw that both the transports are taking lock on same domain with same range. 18:49:30 :) ⚡ sudo grep -B1 "ACTIVE" /usr/local/var/run/gluster/home-gfs-r2_0.324.dump.* home-gfs-r2_0.324.dump.1375794971-lock-dump.domain.domain=r2-replicate-0:metadata home-gfs-r2_0.324.dump.1375794971:inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 7525, owner=78f9e652497f0000, transport=0x15ac9e0, , granted at Tue Aug 6 18:46:11 2013 home-gfs-r2_0.324.dump.1375795051-lock-dump.domain.domain=r2-replicate-0:metadata home-gfs-r2_0.324.dump.1375795051:inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 8879, owner=0019cc3cad7f0000, transport=0x158f580, , granted at Tue Aug 6 18:47:31 2013 Change-Id: I268df4efd93a377a0c73fbc59b739ef12a7a8bb6 BUG: 993981 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5503 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* libglusterfs/client_t client_t implementation, phase 1Kaleb S. KEITHLEY2013-07-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | Implementation of client_t The feature page for client_t is at http://www.gluster.org/community/documentation/index.php/Planning34/client_t In addition to adding libglusterfs/client_t.[ch] it also extracts/moves the locktable functionality from xlators/protocol/server to libglusterfs, where it is used; thus it may now be shared by other xlators too. This patch is large as it is. Hooking up the state dump is left to do in phase 2 of this patch set. (N.B. this change/patch-set supercedes previous change 3689, which was corrupted during a rebase. That change will be abandoned.) BUG: 849630 Change-Id: I1433743190630a6d8119a72b81439c0c4c990340 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/3957 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* features/changelog: fixes when enabling changelogVenky Shankar2013-07-252-29/+41
| | | | | | | | | | | | | Other enhancements being: * ignore fops made by rebalance * ignore internally triggered fops BUG: 987734 Change-Id: I7dd164ae3c209fdb8ec43a27e67b8846f937c93b Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/5380 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/marker: pass xdata in marker_unlink()Venky Shankar2013-07-242-1/+6
| | | | | | | | | | Change-Id: Ia310af96b25f29351f3adc4bbc97aea271df7673 BUG: 987747 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/5379 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* move 'xlators/marker/utils/' to 'geo-replication/' directoryAvra Sengupta2013-07-2219-3982/+1
| | | | | | | | | | Change-Id: Ibd0faefecc15b6713eda28bc96794ae58aff45aa BUG: 847839 Original Author: Amar Tumballi <amarts@redhat.com> Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5133 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/changelog: changelog translatorAvra Sengupta2013-07-2225-3/+4974
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the initial version of the Changelog Translator. What is it ----------- Goal is to capture changes performed on a GlusterFS volume. The translator needs to be loaded on the server (bricks) and captures changes in a plain text file inside a configured directory path (controlled by "changelog-dir", should be somewhere in <export>/.glusterfs/changelog by default). Changes are classified into 3 types: - Data: : TYPE-I - Metadata : TYPE-II - Entry : TYPE-III Changelog file is rolled over after a certain time interval (defauls to 60 seconds) after which a changelog is started. The thing to be noted here is that for a time interval (time slice) multiple changes for an inode are recorded only once (ie. say for 100+ writes on an inode that happens within the time slice has only a single corresponding entry in the changelog file). That way we do not bloat up the changelog and also save lots of writes. Changelog Format ----------------- TYPE-I and TYPE-II changes have the gfid on the entity on which the operation happened. TYPE-III being a entry op requires the parent gfid and the basename. Changelog format has been kept to a minimal and it's upto the consumers to do the heavy loading of figuring out deletes, renames etc.. A single changelog file records all three types of changes, with each change starting with an identifier ("D": DATA, "M": METADATA and "E": ENTRY). Option is provided for the encoding type (See TUNABLES). Consumers ---------- The only consumer as of today would be geo-replication, although backup utilities, self-heal, bit-rot detection could be possible consumers in the future. CLI ---- By default, change-logging is disabled (the translator is present in the server graph but does nothing). When enabled (via cli) each brick starts to log the changes. There are a set of tunable that can be used to change the translators behaviour: - enable/disable changelog (disabled by default) gluster volume set <volume> changelog {on|off} - set the logging directory (<brick>/.glusterfs/changelogs is the default) gluster volume set <volume> changelog-dir /path/to/dir - select encoding type (binary (default) or ascii) gluster volume set <volume> encoding {binary|ascii} - change the rollover time for the logs (60 secs by default) gluster volume set <volume> rollover-time <secs> - when secs > 0, changelog file is not open()'d with O_SYNC flag - and fsync is trigerred periodically every <secs> seconds. gluster volume set <volume> fsync-interval <secs> features/changelog: changelog consumer library (libgfchangelog) A shared library is provided for the consumer of the changelogs for easy acess via APIs. Application can link against this library and request for changelog updates. Conversion of binary logs to human-readable ascii format is also taken care by the library which keeps a copy of the changelog in application provided working directory. Change-Id: I75575fb7f1c53d2bec3dba1a329ea7bb3c628497 BUG: 847839 Original Author: Venky Shankar <vshankar@redhat.com> Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5127 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* locks: Added an xdata-based 'cmd' for inodelk count in a given domainshishir gowda2013-07-184-20/+67
| | | | | | | | | | | | | | | | | | | | Following is the semantics of the 'cmd': 1) If @domain is NULL - returns no. of locks blocked/granted in all domains 2) If @domain is non-NULL- returns no. of locks blocked/granted in that domain 3) If @domain is non-existent - returns '0'; This is important since locks xlator creates a domain in a lazy manner. where @domain - a string representing the domain. Change-Id: I5e609772343acc157ca650300618c1161efbe72d BUG: 951195 Original-author: Krishnan Parthasarathi <kparthas@redhat.com> Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4889 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* glusterfs: discard (hole punch) supportBrian Foster2013-06-131-0/+70
| | | | | | | | | | | | | | | | Add support for the DISCARD file operation. Discard punches a hole in a file in the provided range. Block de-allocation is implemented via fallocate() (as requested via fuse and passed on to the brick fs) but a separate fop is created within gluster to emphasize the fact that discard changes file data (the discarded region is replaced with zeroes) and must invalidate caches where appropriate. BUG: 963678 Change-Id: I34633a0bfff2187afeab4292a15f3cc9adf261af Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/5090 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* gluster: add fallocate fop supportBrian Foster2013-06-132-0/+241
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement support for the fallocate file operation. fallocate allocates blocks for a particular inode such that future writes to the associated region of the file are guaranteed not to fail with ENOSPC. This patch adds fallocate support to the following areas: - libglusterfs - mount/fuse - io-stats - performance/md-cache,open-behind - quota - cluster/afr,dht,stripe - rpc/xdr - protocol/client,server - io-threads - marker - storage/posix - libgfapi BUG: 949242 Change-Id: Ice8e61351f9d6115c5df68768bc844abbf0ce8bd Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/4969 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* configure.ac: build glupy with installed pythonKaleb S. KEITHLEY2013-05-192-12/+3
| | | | | | | | | | | | | | IOW with more than just python2.6. Python2.7 is certainly what's on the vast majority of non-RHEL systems that are out there. Also our rpm.t regression test will build on epel-5 under mock; RHEL5 has Python2.4. Change-Id: I09c95c1fb6b3498e910ad239c4f0af7f786c3700 BUG: 961856 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/5007 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* locks: fix leaking entrylk lock structureAnand Avati2013-05-161-0/+2
| | | | | | | | | | | | | When entrylk lock requests are blocked and granted aysnchronously, the entrylk lock structure was getting leaked. Change-Id: Ie3f29f550730189f27745d991b029e50c63e63da BUG: 962350 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4991 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glupy patch by Ram, Justin: Add/Modify fops, structure types, utility fnsRam Raja2013-05-138-162/+3773
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend the following fops with Python: * open * readv * writev * opendir * readdir * readdirp * stat * fstat * statfs * setxattr * getxattr * fsetxattr * fgetxattr * removexattr * fremovexattr * link * unlink * readlink * symlink * mkdir * rmdir Add fd_t, inode_t and iatt_t structure types. Modify loc_t structure type; Alter the data types of the following attributes - inode, parent, gfid, pargfid. Modify uuid2str function, which returns a string equivalent for a ctype object representing a gfid, to make use of python's 'uuid' module for accurate representation of uuids. by Justin Clift: Adjust debug-trace.py to work with Python 2.6 Work around 'zero length field name in format' bug in negative.py's uuid2str function Fix indentation errors in negative.py, glupy.h, glupy.c, gluster.py Change-Id: If0fcfb2866e21c0380a973f8ffab9ea7b6a4cd5d BUG: 961856 Signed-off-by: Ram Raja <rraja@redhat.com> Reviewed-on: http://review.gluster.org/4907 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Justin Clift <jclift@redhat.com> Tested-by: Justin Clift <jclift@redhat.com>
* glupy: Importing Jeff's glupy project into glusterfsRam Raja2013-05-109-1/+750
| | | | | | | | | | Change-Id: I3891ef6eaf6ede7c8cbedc3298ce2501a69b2b05 BUG: 961856 Original-author: Jeff Darcy <jdarcy@redhat.com> Signed-off-by: Ram Raja <rraja@redhat.com> Reviewed-on: http://review.gluster.org/4906 Reviewed-by: Justin Clift <jclift@redhat.com> Tested-by: Justin Clift <jclift@redhat.com>
* gsync: Display additional information in status commandsarvotham s pai2013-04-042-4/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | Added code to display extra information when status command is executed. Information shown now are 1 Number of files synced 2 crawl time 3 total sync time 4 bytes synced bytes synced is taken from rsync output . --stats option of rsync gives extra infor mation about the sync.In stats output there is a field called Total transferred file size which states the ammount of bytes synced . This information is parsed from stdout output using regular expressions.Bytes synced information can be used to calculate throughput. Change-Id: Id9bba9fff45ee7049bb8257c6fd918e5237e05b1 BUG: 947774 Signed-off-by: sarvotham s pai <spai@redhat.com> Reviewed-on: http://review.gluster.org/4749 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* features/marker: log error when unlinking timestamp fileVenky Shankar2013-03-211-4/+13
| | | | | | | | | | | | | | | | ... so it's easy to figure out errno caused it. As of now it's only due to ENOSPC. Logging is done in the error handling routine, so any further changes that require unlinking of the timestamp file due to some error condition(s) are logged. Change-Id: Ia59338e2e32b2adbbd1d56aa260018270f1abae9 BUG: 853911 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/4649 Reviewed-by: Csaba Henk <csaba@redhat.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* geo-rep: retire old style ssh setupCsaba Henk2013-03-141-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | Users are still using geo-rep with the old, deprecated, insecure, unsupported ssh setup. Not their fault -- the implementation of the new method had the following charasteristics: - old method is possible, but with default settings it's not working - it can be made operational by fiddling with "remote-gsyncd" tunable - with default setting, an unhelpful, actually misleading error message is produced - the UI gave no hint to the changes in the ssh setup http://review.gluster.org/4392 tried to fix these; what it accomplished was unrestricted support to the bad practice (by making the default old setup operational). From this on: - we disable the old method by reserving the "remote-gsyncd" tunable - if the old method is attempted, give a hint what to do Change-Id: Icade94725d8d8d2d4c89cab992d4226351637b86 BUG: 895656 Signed-off-by: Csaba Henk <csaba@redhat.com> Reviewed-on: http://review.gluster.org/4602 Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* features/quota: Add option to consider the quota limit in statfs estimationVarun Shastry2013-02-192-3/+21
| | | | | | | | | | | | | | | Adds an option, features.quota-deem-statfs (default off) to consider the quota limits while calculating the volume stats. Eg: Backend is of size 10GB and limit set on / is 5GB. If the option is off df show actual size to be 10GB and when it is on df shows 5GB. Change-Id: Ib30733bb69afecce1dea9d0491af89d551d214cc BUG: 905425 Signed-off-by: Varun Shastry <vshastry@redhat.com> Reviewed-on: http://review.gluster.org/4511 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* features: add a directory-protection translatorJeff Darcy2013-02-176-1/+460
| | | | | | | | | | | | | | | | | | | | | | This is useful to find all calls that remove a file from the protected directory, including renames and internal calls. Such calls will cause a stack trace to be logged. There's a filter script to add the needed translators, and then the new functionality can be invoked with one of the following commands. setfattr -n trusted.glusterfs.protect -v log $dir setfattr -n trusted.glusterfs.protect -v reject $dir setfattr -n trusted.glusterfs.protect -v anything_else $dir The first logs calls, but still allows them. The second rejects them with EPERM. The third turns off protection for that directory. Change-Id: Iee4baaf8e837106be2b4099542cb7dcaae40428c BUG: 888072 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/4496 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* gsyncd: allow the override of the compiled-in python pathJoe Julian2013-02-071-3/+7
| | | | | | | | | | | .. using the environment variable $PYTHON Change-Id: Ieaad8be98b826c803268216826e250d9944c8190 BUG: 882127 Signed-off-by: Joe Julian <me@joejulian.name> Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4252 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* Use proper libtool option -avoid-version instead of bogus -avoidversionAnand Avati2013-02-0710-11/+11
| | | | | | | | | | Change-Id: I1c9541058c7d07786539a3266ca125a6a15287d8 BUG: 859835 Signed-off-by: Anand Avati <avati@redhat.com> Original-author: Kacper Kowalik (Xarthisius) <xarthisius.kk@gmail.com> Signed-off-by: Kacper Kowalik (Xarthisius) <xarthisius.kk@gmail.com> Reviewed-on: http://review.gluster.org/3967 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* geo-rep / gsyncd: Separate log file directory for Mountbroker sessionsVenky Shankar2013-02-041-1/+5
| | | | | | | | | | | | | | | | | | | | | ... so that a mountbroker session which is initiated b/w master and slave does not use the same log file if it's started after a normal geo-rep session b/w master and slave. This results in EPERM as the log file is owned by root and the geo-rep slave process (now running as a non privileged user) does not have access to it. Also, having separate log file directory for mountbroker sessions looks clean. NOTE: geo-rep's client mount log file location remains unchanged. Change-Id: Ic7a732e250aee5393b9c3f6ebf6dfe2c310b7fe4 BUG: 893960 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/4407 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* locks: Protected racy (read) access of ext_listKrishnan Parthasarathi2013-02-031-10/+18
| | | | | | | | | | Change-Id: Ibf639695ebd99c11c6960c9be82c0cee71b50744 BUG: 905864 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/4458 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* core: fixes for gcc's '-pedantic' flag buildAvra Sengupta2013-01-214-10/+5
| | | | | | | | | | | | | * warnings on 'void *' arguments * warnings on empty initializations * warnings on empty array (array[0]) Change-Id: Iae440f54cbd59580eb69f3ecaed5a9926c0edf95 BUG: 875913 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/4219 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* core: remove all the 'inner' functions in codebaseAmar Tumballi2012-12-191-1/+3
| | | | | | | | | | | | | | | | * move 'dict_keys_join()' from api/glfs_fops.c to libglusterfs/dict.c - also added an argument which is treated as a filter function if required, currently useful for fuse. * now 'make CFLAGS="-std=gnu99 -pedantic" 2>&1 | grep nested' gives no output. Change-Id: I4e18496fbd93ae1d3942026ef4931889cba015e8 Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 875913 Reviewed-on: http://review.gluster.org/4187 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* features/locks : Made changes to display brick information on clearing locks.Avra Sengupta2012-12-181-16/+101
| | | | | | | | | Change-Id: I664614677bc887ce087bfca067e6e57f0d6b659d BUG: 824753 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/4272 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* geo-rep: do not access BaseException.message in syncdutilsNiels de Vos2012-12-181-2/+2
| | | | | | | | | | | | | | http://www.python.org/dev/peps/pep-0352/ explains that the .message property of BaseException is being removed. Most of the other exception handlers access <Exception>.args[] which should be suitable for this case too. Change-Id: I1810450b78d2b3d7f8bd07f2beb02cbe9e2adecb BUG: 888346 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/4328 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* geo-rep / gsyncd: play nicely with peer multiplexing when setting a checkpointCsaba Henk2012-12-041-5/+16
| | | | | | | | | | | | | | | | | | | | | | | | The gsyncd invocation that instruments the "geo-rep config" command is multiplexed over peers to ensure the uniformity of configuration. In general, that works well, but checkpoint setting is a special case, because (unlike other instances of config-set) it is logged (as recording of checkpoint events is part of the feature). Problem is that the path components leading to the log file are created only on the original node, where gsyncd was started. Therefore the logging attempt will fail on the other nodes. Fix: ignore if opening the logfile on behalf of checkpoint setting fails with ENOENT. Change-Id: I677f3f081bf4b9e3ba4d25d58979d86931e6beb4 BUG: 881997 Signed-off-by: Csaba Henk <csaba@redhat.com> Reviewed-on: http://review.gluster.org/4248 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Christos Triantafyllidis <ctrianta@redhat.com> Reviewed-by: Christos Triantafyllidis <ctrianta@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* geo-rep / gsyncd,glusterd: do not hardcode socket pathCsaba Henk2012-11-282-2/+5
| | | | | | | | | | | | ... in gsyncd python code. Indeed, use the configuration mechanism to set it suitably from glusterd. Change-Id: I9fe2088b14d28588d1e64fe892740cc5755b8365 BUG: 868877 Signed-off-by: Csaba Henk <csaba@redhat.com> Reviewed-on: http://review.gluster.org/4143 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* geo-replication: catch select.error on select()Niels de Vos2012-11-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | tailer() in resource.py does not correctly catch exceptions from select(). select() can raise an instance of the select.error class and the current expression only catches ValueError (and the instance will have reference called selecterror). The geo-rep log contains a call trace like this: > E [syncdutils:190:log_raise_exception] <top>: FAIL: > Traceback (most recent call last): > File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 216, in twrap > tf(*aa) > File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 123, in tailer > poe, _ ,_ = select([po.stderr for po in errstore], [], [], 1) > File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 276, in select > return eintr_wrap(oselect.select, oselect.error, *a) > File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 269, in eintr_wrap > return func(*a) > error: (9, 'Bad file descriptor') BUG: 880308 Change-Id: I2babe42918950d0e9ddb3d08fa21aa3548ccf7c5 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/4233 Reviewed-by: Peter Portante <pportant@redhat.com> Reviewed-by: Csaba Henk <csaba@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/locks: implement fgetxattr and fsetxattrRaghavendra G2012-11-272-0/+342
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | implement xattrs for GF_XATTR_LOCKINFO_KEY, which will be used for posix-locks migration from old to new graph after a switch. fgetxattr (fd, GF_XATTR_LOCKINFO_KEY) will return a dict. This dict has a serialized dict stored for key GF_XATTR_LOCKINFO_KEY. This serialized dict in turn has fdnum value of locks acquired on this fd with modified pathinfo (containing hostname and base directory components) as key. fsetxattr (newfd, GF_XATTR_LOCKINFO_KEY, dict) has following semantics. dict can be the result of a previous fgetxattr with GF_XATTR_LOCKINFO_KEY. In that case, a dict_get on dict constructed using serialized buffer is done on modified pathinfo as key. If a value is got, that value is treated as fdnum and for every lock l on newfd->inode we do, if (l->fdnum == fdnum) { l->fdnum = fd_fdnum (newfd); l->transport = <connection identifier of connection on which fsetxattr came>; } Signed-off-by: Raghavendra G <raghavendra@gluster.com> Change-Id: I73a8f43aa0b6077bc19f8de52205ba748f2d8bbe BUG: 808400 Reviewed-on: http://review.gluster.org/4120 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* Put _check_key_is_zero_filled outside _xattrop_index_actionVenkatesh Somyajulu2012-11-201-11/+13
| | | | | | | | | Change-Id: Ifb89a3a911213b2816a540a104558e7c3c13e23a BUG: 874498 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/4182 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>