glusterfs-snapshot.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	glusterfs: zerofill support	M. Mohan Kumar	2013-11-10	1	-16/+184
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for a new ZEROFILL fop. Zerofill writes zeroes to a file in the specified range. This fop will be useful when a whole file needs to be initialized with zero (could be useful for zero filled VM disk image provisioning or during scrubbing of VM disk images). Client/application can issue this FOP for zeroing out. Gluster server will zero out required range of bytes ie server offloaded zeroing. In the absence of this fop, client/application has to repetitively issue write (zero) fop to the server, which is very inefficient method because of the overheads involved in RPC calls and acknowledgements. WRITESAME is a SCSI T10 command that takes a block of data as input and writes the same data to other blocks and this write is handled completely within the storage and hence is known as offload . Linux ,now has support for SCSI WRITESAME command which is exposed to the user in the form of BLKZEROOUT ioctl. BD Xlator can exploit BLKZEROOUT ioctl to implement this fop. Thus zeroing out operations can be completely offloaded to the storage device , making it highly efficient. The fop takes two arguments offset and size. It zeroes out 'size' number of bytes in an opened file starting from 'offset' position. This patch adds zerofill support to the following areas: - libglusterfs - io-stats - performance/md-cache,open-behind - quota - cluster/afr,dht,stripe - rpc/xdr - protocol/client,server - io-threads - marker - storage/posix - libgfapi Client applications can exloit this fop by using glfs_zerofill introduced in libgfapi.FUSE support to this fop has not been added as there is no system call for this fop. Changes from previous version 3: * Removed redundant memory failure log messages Changes from previous version 2: * Rebased and fixed build error Changes from previous version 1: * Rebased for latest master TODO : * Add zerofill support to trace xlator * Expose zerofill capability as part of gluster volume info Here is a performance comparison of server offloaded zeofill vs zeroing out using repeated writes. [root@llmvm02 remote]# time ./offloaded aakash-test log 20 real 3m34.155s user 0m0.018s sys 0m0.040s [root@llmvm02 remote]# time ./manually aakash-test log 20 real 4m23.043s user 0m2.197s sys 0m14.457s [root@llmvm02 remote]# time ./offloaded aakash-test log 25; real 4m28.363s user 0m0.021s sys 0m0.025s [root@llmvm02 remote]# time ./manually aakash-test log 25 real 5m34.278s user 0m2.957s sys 0m18.808s The argument log is a file which we want to set for logging purpose and the third argument is size in GB . As we can see there is a performance improvement of around 20% with this fop. Change-Id: I081159f5f7edde0ddb78169fb4c21c776ec91a18 BUG: 1028673 Signed-off-by: Aakash Lal Das <aakash@linux.vnet.ibm.com> Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/5327 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	posix: Fix excessive logging resulting from get_real_filename failure from samba	Krutika Dhananjay	2013-11-07	1	-2/+3
\| \| \| \| \| \| \| \| \| \|	Change-Id: I641d028165da7b8501bd372c62d2df89a9d4db1f BUG: 1027174 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/6237 Reviewed-by: poornima g <pgurusid@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	posix: Fix readv FOP	M. Mohan Kumar	2013-10-17	1	-5/+1
\| \| \| \| \| \| \| \| \| \| \|	Suggested by Anand Avati in BD xlator code review. Change-Id: I31c353a26dfdeb3d0023c3f7e03ed25461d13c16 Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> BUG: 837495 Reviewed-on: http://review.gluster.org/6077 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
*	storage/posix: Lower log severity for ENODATA errors in getxattr()	Vijay Bellur	2013-10-15	1	-1/+2
\| \| \| \| \| \| \| \| \|	Change-Id: I101e329cce4c1305615c63ebcb42355f6c3e85e0 BUG: 918052 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/6084 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	gNFS: Incorrect NFS ACL encoding for XFS	Santosh Kumar Pradhan	2013-09-29	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Incorrect NFS ACL encoding causes "system.posix_acl_default" setxattr failure on bricks on XFS file system. XFS (potentially others?) doesn't understand when the 0x10 prefix is added to the ACL type field for default ACLs (which the Linux NFS client adds) which causes setfacl()->setxattr() to fail silently. NFS client adds NFS_ACL_DEFAULT(0x1000) for default ACL. FIX: Mask the prefix (added by NFS client) OFF, so the setfacl is not rejected when it hits the FS. Original patch by: "Richard Wareing" Change-Id: I17ad27d84f030cdea8396eb667ee031f0d41b396 BUG: 1009210 Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com> Reviewed-on: http://review.gluster.org/5980 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	core: block unused signals in created threads	Anand Avati	2013-09-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Block all signal except those which are set for explicit handling in glusterfs_signals_setup(). Since thread spawning code in libglusterfs and xlators can get called from application threads when used through libgfapi, it is necessary to do this blocking. Change-Id: Ia320f80521a83d2edcda50b9ad414583a0175281 BUG: 1011662 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/5995 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	core: remove GLUSTERFS_CREATE_MODE_KEY usage	Amar Tumballi	2013-08-23	1	-14/+0
\| \| \| \| \| \| \| \| \|	Change-Id: I23b8cb7223b91a55af1cd4214f61bbe0e87351f6 BUG: 952029 Signed-off-by: Amar Tumballi <amarts@redhat.com> Reviewed-on: http://review.gluster.org/5683 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	core: changes to support gfid-access	Amar Tumballi	2013-08-21	1	-0/+14
\| \| \| \| \| \| \| \| \|	Change-Id: I38d2fdc47e4b805deafca6805e54807976ffdb7e Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 952029 Reviewed-on: http://review.gluster.org/5496 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	Revert "fuse: auxiliary gfid mount support"	Amar Tumballi	2013-08-21	1	-14/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 4c0f4c8a89039b1fa1c9c015fb6f273268164c20. Conflicts: xlators/mount/fuse/src/fuse-bridge.c For build issues added CREATE_MODE_KEY definition in: libglusterfs/src/glusterfs.h Change-Id: I8093c2a0b5349b01e1ee6206025edbdbee43055e BUG: 952029 Signed-off-by: Amar Tumballi <amarts@redhat.com> Reviewed-on: http://review.gluster.org/5495 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	afr: treat appending writes as stable writes.	Anand Avati	2013-08-13	1	-2/+39
\| \| \| \| \| \| \| \| \| \| \| \| \|	Durability of appending writes is implicit in the file size. Therefore performing an explicit fsync() is unnecessary in such cases as self-heal can check for the size of file when pending changelog is not unambiguous. Change-Id: I05446180a91d20e0dbee5de5a7085b87d57f178a BUG: 927146 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/5501 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	posix: Default value for `batch-fsync-delay-usec` should be '0'	Harshavardhana	2013-08-13	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Also fixes for failing testcase `./tests/bugs/bug-888174.t`, which has been failing sporadically for many patches. Change-Id: Ic7d2c95da5d3126623cec403207afadd449bf950 BUG: 927146 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/5620 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	storage/posix: Enable Open-fd-count query in writev	Pranith Kumar K	2013-08-02	1	-1/+39
\| \| \| \| \| \| \| \| \| \|	Change-Id: I86bdf865730416150c10617dcbad5c037579acde BUG: 910217 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5433 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	Revert "storage/posix: Remove the interim fix that handles the gfid race"	shishir gowda	2013-07-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 97807e75956a2d240282bc64fab1b71762de0546. In a distribute or distribute-replica volume, this fix is required to prevent gfid mis-match due to race issues. test script bug-767585-gfid.t needs a sleep of 2, cause after setting backend gfid directly, we try to heal, and with this fix, we do not allow setxattr of gfid within creation of 1 second if not created by itself Change-Id: Ie3f4b385416889fd5de444638a64a7eaaf24cd60 BUG: 951195 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/5240 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
*	storage/posix: implement batched fsync in a single thread	Anand Avati	2013-07-23	1	-0/+111
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Because of the extra fsync()s issued by AFR transaction, they could potentially "clog" all the io-threads denying unrelated operations from making progress. This patch assigns a dedicated thread to issues fsyncs, as an experimental feature to understand performance characteristics with the approach. As a basis, incoming individual fsync requests are grouped into batches, falling in the same @batch-fsync-delay-usec window of time. These windows can extend in practice, as processing of the previous batch can take longer than @batch-fsync-delay-usec while new requests are getting batched. The feature support three modes (similar to the -S modes of fs_mark) - syncfs: In this mode one syncfs() is issued per batch, instead of N fsync()s (one per file.) - syncfs-single-fsync: In this mode one syncfs() is issued per batch (which, on Linux, guarantees the completion of write-out of dirty pages in the filesystem up to that point) and one single fsync() to synchronize or flush the controller/drive cache. This corresponds to -S 2 of fsmark. - syncfs-reverse-fsync: In this mode, one syncfs() is issued per batch, and all the open files in that batch are fsync()'ed in the reverse order of the queue. This corresponds to -S 4 of fsmark. - reverse-fsync: In this mode, no syncfs() is issued and all the files in the batch are fsync()'ed in the reverse order. This corresponds to -S 3 of fsmark. Change-Id: Ia1e170a810c780c8d80e02cf910accc4170c4cd4 BUG: 927146 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4746 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	fuse: auxiliary gfid mount support	Raghavendra G	2013-07-19	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* files can be accessed directly through their gfid and not just through their paths. For eg., if the gfid of a file is f3142503-c75e-45b1-b92a-463cf4c01f99, that file can be accessed using <gluster-mount>/.gfid/f3142503-c75e-45b1-b92a-463cf4c01f99 .gfid is a virtual directory used to seperate out the namespace for accessing files through gfid. This way, we do not conflict with filenames which can be qualified as uuids. * A new file/directory/symlink can be created with a pre-specified gfid. A setxattr done on parent directory with fuse_auxgfid_newfile_args_t initialized with appropriate fields as value to key "glusterfs.gfid.newfile" results in the entry <parent>/bname whose gfid is set to args.gfid. The contents of the structure should be in network byte order. struct auxfuse_symlink_in { char linkpath[]; /* linkpath is a null terminated string / } __attribute__ ((__packed__)); struct auxfuse_mknod_in { unsigned int mode; unsigned int rdev; unsigned int umask; } __attribute__ ((__packed__)); struct auxfuse_mkdir_in { unsigned int mode; unsigned int umask; } __attribute__ ((__packed__)); typedef struct { unsigned int uid; unsigned int gid; char gfid[UUID_CANONICAL_FORM_LEN + 1]; / a null terminated gfid string * in canonical form. / unsigned int st_mode; char bname[]; / bname is a null terminated string / union { struct auxfuse_mkdir_in mkdir; struct auxfuse_mknod_in mknod; struct auxfuse_symlink_in symlink; } __attribute__ ((__packed__)) args; } __attribute__ ((__packed__)) fuse_auxgfid_newfile_args_t; An initial consumer of this feature would be geo-replication to create files on slave mount with same gfids as that on master. It will also help gsyncd to access files directly through their gfids. gsyncd in its newer version will be consuming a changelog (of master) containing operations on gfids and sync corresponding files to slave. Also, bring in support to heal gfids with a specific value. fuse-bridge sends across a gfid during a lookup, which storage translators assign to an inode (file/directory etc) if there is no gfid associated it. This patch brings in support to specify that gfid value from an application, instead of relying on random gfid generated by fuse-bridge. gfids can be healed through setxattr interface. setxattr should be done on parent directory. The key used is "glusterfs.gfid.heal" and the value should be the following structure whose contents should be in network byte order. typedef struct { char gfid[UUID_CANONICAL_FORM_LEN + 1]; /* a null terminated gfid * string in canonical form / char bname[]; / a null terminated basename */ } __attribute__((__packed__)) fuse_auxgfid_heal_args_t; This feature can be used for upgrading older geo-rep setups where gfids of files are different on master and slave to newer setups where they should be same. One can delete gfids on slave using setxattr -x and .glusterfs and issue stat on all the files with gfids from master. Thanks to "Amar Tumballi" <amarts@redhat.com> and "Csaba Henk" <csaba@redhat.com> for their inputs. Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: Ie8ddc0fb3732732315c7ec49eab850c16d905e4e BUG: 952029 Reviewed-on: http://review.gluster.com/#/c/4702 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Amar Tumballi <amarts@redhat.com> Reviewed-on: http://review.gluster.org/4702 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	posix: add a simple health-checker	Niels de Vos	2013-07-03	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Goal of this health-checker is to detect fatal issues of the underlying storage that is used for exporting a brick. The current implementation requires the filesystem to detect the storage error, after which it will notify the parent xlators and exit the glusterfsd (brick) process to prevent further troubles. The interval the health-check runs can be configured per volume with the storage.health-check-interval option. The default interval is 30 seconds. It is not trivial to write an automated test-case with the current prove-framework. These are the manual steps that can be done to verify the functionality: - setup a Logical Volume (/dev/bz970960/xfs) and format is as XFS for brick usage - create a volume with the one brick # gluster volume create failing_xfs glufs1:/bricks/failing_xfs/data # gluster volume start failing_xfs - mount the volume and verify the functionality - make the storage fail (use device-mapper, or pull disks) # dmsetup table .. bz970960-xfs: 0 196608 linear 7:0 2048 # echo 0 196608 error > dmsetup-error-target # dmsetup load bz970960-xfs dmsetup-error-target # dmsetup resume bz970960-xfs # dmsetup table ... bz970960-xfs: 0 196608 error - notice the errors caught by syslog: Jun 24 11:31:49 vm130-32 kernel: XFS (dm-2): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 buf count 512 Jun 24 11:31:49 vm130-32 kernel: XFS (dm-2): I/O Error Detected. Shutting down filesystem Jun 24 11:31:49 vm130-32 kernel: XFS (dm-2): Please umount the filesystem and rectify the problem(s) Jun 24 11:31:49 vm130-32 kernel: VFS:Filesystem freeze failed Jun 24 11:31:50 vm130-32 GlusterFS[1969]: [2013-06-24 10:31:50.500674] M [posix-helpers.c:1114:posix_health_check_thread_proc] 0-failing_xfs-posix: health-check failed, going down Jun 24 11:32:09 vm130-32 kernel: XFS (dm-2): xfs_log_force: error 5 returned. Jun 24 11:32:20 vm130-32 GlusterFS[1969]: [2013-06-24 10:32:20.508690] M [posix-helpers.c:1119:posix_health_check_thread_proc] 0-failing_xfs-posix: still alive! -> SIGTERM - these errors are in the log of the brick as well: [2013-06-24 10:31:50.500607] W [posix-helpers.c:1102:posix_health_check_thread_proc] 0-failing_xfs-posix: stat() on /bricks/failing_xfs/data returned: Input/output error [2013-06-24 10:31:50.500674] M [posix-helpers.c:1114:posix_health_check_thread_proc] 0-failing_xfs-posix: health-check failed, going down [2013-06-24 10:32:20.508690] M [posix-helpers.c:1119:posix_health_check_thread_proc] 0-failing_xfs-posix: still alive! -> SIGTERM - the glusterfsd process has exited correctly: # gluster volume status Status of volume: failing_xfs Gluster process Port Online Pid ------------------------------------------------------------------------------ Brick glufs1:/bricks/failing_xfs/data N/A N N/A NFS Server on localhost 2049 Y 1897 Change-Id: Ic247fbefb97f7e861307a5998a9a7a3ecc80aa07 BUG: 971774 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/5176 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	glusterfs: discard (hole punch) support	Brian Foster	2013-06-13	1	-27/+63
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for the DISCARD file operation. Discard punches a hole in a file in the provided range. Block de-allocation is implemented via fallocate() (as requested via fuse and passed on to the brick fs) but a separate fop is created within gluster to emphasize the fact that discard changes file data (the discarded region is replaced with zeroes) and must invalidate caches where appropriate. BUG: 963678 Change-Id: I34633a0bfff2187afeab4292a15f3cc9adf261af Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/5090 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	gluster: add fallocate fop support	Brian Foster	2013-06-13	1	-0/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement support for the fallocate file operation. fallocate allocates blocks for a particular inode such that future writes to the associated region of the file are guaranteed not to fail with ENOSPC. This patch adds fallocate support to the following areas: - libglusterfs - mount/fuse - io-stats - performance/md-cache,open-behind - quota - cluster/afr,dht,stripe - rpc/xdr - protocol/client,server - io-threads - marker - storage/posix - libgfapi BUG: 949242 Change-Id: Ice8e61351f9d6115c5df68768bc844abbf0ce8bd Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/4969 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	dht,posix: support for case discovery	Anand Avati	2013-05-25	1	-0/+66
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is support for discovering a filename in a given directory which has a case insensitive match of a given name. It is implemented as a virtual extended attribute on the directory where the required filename is specified in the key. E.g: sh# getfattr -e "text" -n user.glusterfs.get_real_filename:FiLe-B /mnt/samba/patchy getfattr: Removing leading '/' from absolute path names # file: mnt/samba/patchy user.glusterfs.get_real_filename:FiLe-B="file-b" In reality, there can be multiple "answers" as the backend filesystem is case sensitive and there can be multiple files which can strcasecamp() successfully. In this case we pick the first matched file from the first responding server. If a matching file does not exist, we return ENOENT (and NOT ENODATA). This way the caller can differentiate between "unsupported" glusterfs API and file not existing. This API is used by Samba VFS to perform efficient discovery of the real filename without doing a full scan at the Samba level. Change-Id: I53054c4067cba69e585fd0bbce004495bc6e39e8 BUG: 953694 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4941 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	storage/posix: introduce node-uuid-pathinfo	Venky Shankar	2013-04-05	1	-2/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	enabling this option has an effect on pathinfo xattr request returning <node-uuid>:<path> instead of the default - which is <hostname>:<path>. Change-Id: Ice1b38abf8e5df1568bab6d79ec0d53dfa520332 BUG: 765380 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/4567 Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	posix: fix dangerous "sharing" of fd in readdir between two requests	Anand Avati	2013-04-03	1	-2/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	posix_fill_readdir() is a multi-step function which performs many readdir() calls, and expects the directory cursor to have not "seeked away" elsewhere between two successive iterations. Usually this is not a problem as each opendir() from an application has its own backend fd, and there is nobody else to "seek away" the directory cursor. However in case of NFS's use of anonymous fd, the same fd_t is shared between all NFS readdir requests, and two readdir loops can be executing in parallel on the same dir dragging away the cursor in a chaotic manner. The fix in this patch is to lock on the fd around the loop. Another approach could be to reimplement posix_fill_readdir() with a single getdents() call, but that's for another day. Change-Id: Ia42e9c7fbcde43af4c0d08c20cc0f7419b98bd3f BUG: 948086 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4774 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	cluster/afr: sync xattrs removed on source to sink(s)	Venky Shankar	2013-04-02	1	-0/+39
\| \| \| \| \| \| \| \| \| \| \| \| \|	xattrs are first removed from sink followed by setting source xattrs. Change-Id: I181cb5b785b667bbfc6e40787a2183a8f45de06b BUG: 906646 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/4656 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	storage/posix: honor O_SYNC and O_DSYNC sent in @flags of writev()	Anand Avati	2013-03-29	1	-4/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Historic bug - posix_writev() has been inspecting pfd->flushwrites for performing fsync() after write, instead of @flags for O_SYNC\|O_DSYNC. pfd->flushwrites was never set anywhere and is unused completely. This is behavior from the time before anonymous FD where open() had @wbflags param. This is a leftover from that cleanup. Change-Id: Id9bfe562a60db4eb3bd0a7705bdba91f2df2f3ec BUG: 916372 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4738 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	Storage/posix: Don't log at ERROR level for failed getxattr.	Raghavendra Talur	2013-03-12	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: ENOATTR returned by getxattr -n <NotAnExistingAttribute> <file> was being logged at ERROR level. Solution: Moved logging to DEBUG level. Change-Id: I982a577a4c231faa958ea71abdb272f8d5ffd70c BUG: 918052 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/4628 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	Modified validation parameters for owner-uid and owner-gid.	Avra Sengupta	2013-03-01	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \|	owner-uid and owner-gid will not receive negative values anymore. Change-Id: I82741d3d01b29e448294b2ec093fb70d22a5c77e BUG: 912297 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/4581 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	storage/posix: skip path construction when dentry list is empty	Brian Foster	2013-01-26	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a minor latency optimization to the readdirp path in storage/posix. During a recursive list, we hit this codepath with an empty list once per high-level directory to read when end of directory is reached. Skip constructing hpath, since we don't do anything with it in this case. BUG: 903175 Change-Id: I98d7c65505205d55575f064b1e982700f1320cc0 Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/4432 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	cli: fix descriptions of owner-uid and owner-gid	Shireesh Anjal	2013-01-21	1	-2/+2
\| \| \| \| \| \| \| \| \|	Change-Id: I04c0dd23bc5bc34fd9d7bddb11beeecb8e7e2a49 BUG: 853842 Signed-off-by: Shireesh Anjal <sanjal@redhat.com> Reviewed-on: http://review.gluster.org/4398 Reviewed-by: Anand Avati <avati@redhat.com> Tested-by: Anand Avati <avati@redhat.com>
*	core: remove all the 'inner' functions in codebase	Amar Tumballi	2012-12-19	1	-179/+210
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* move 'dict_keys_join()' from api/glfs_fops.c to libglusterfs/dict.c - also added an argument which is treated as a filter function if required, currently useful for fuse. * now 'make CFLAGS="-std=gnu99 -pedantic" 2>&1 \| grep nested' gives no output. Change-Id: I4e18496fbd93ae1d3942026ef4931889cba015e8 Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 875913 Reviewed-on: http://review.gluster.org/4187 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	storage/posix: if create returns EXIST, donot set gfid/xattrs	shishir gowda	2012-12-04	1	-0/+4
\| \| \| \| \| \| \| \| \|	Change-Id: I9f2b75b10bde428d36d6516aa09c18e590d17ed9 BUG: 864801 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4265 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	posix: Fix volume will not start if brick has no volume-id attribute	Venkatesh Somyajulu	2012-11-20	1	-7/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: If the extended attribute (trusted.glusterfs.volume-id) of a brick is absent and <gluster volume start volume-name> command is executed then curretly volume-id from the volume file will be set as an extended attribute of the brick and volume will get started. But if setup is such that brick is used as a mount point and before executing the <gluster volume start volume-name> command, nothing is mounted on the brick then all the file operations will take place at the brick but actual intention of the brick is to be used as mount point only. FIX: Do not start the volume if extended attribute (trusted.glusterfs.volume-id) is set absent. Change-Id: Id2462d87d6087e97e0b8831512fdbc3595f7078b BUG: 860297 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/4202 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	storage/posix: Make rchecksum O_DIRECT friendly	Pranith Kumar K	2012-11-20	1	-23/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When posix-aio is enabled to perform aio fd is set with O_DIRECT whenever possible in read, writev fops. Rchecksum does not take this into account. If either offset/size/memory-buf passed to pread in rchecksum fop is not aligned, pread fails with EINVAL. Fix: Before doing pread necessary O_DIRECT manipulation is done when aio is enabled. Memory buffer passed to pread is now page-aligned. Test: 1) Create replica volume with aio enabled. 2) dd if=/dev/urandom of=a bs=1M count=1 3) kill one of the bricks in the replica pair 4) dd if=/dev/urandom of=a bs=1M count=1 5) bring back the brick. Self-heal succeeds after the change. The test above checks both rchecksum, writev fops that were changed in this patch. Change-Id: I186099a2854d4864c5b48086ab7bc5f1a7b27313 BUG: 866459 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4134 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	storage/posix: remove dependency on loc->path in posix_lookup()	Amar Tumballi	2012-10-11	1	-1/+0
\| \| \| \| \| \| \| \| \|	Change-Id: I0a3bc8650d9ff83977be696aa5caf9c7570197fd Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 781318 Reviewed-on: http://review.gluster.org/3997 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	logging: log ENOENT errors in DEBUG mode instead of ERROR or INFO	Raghavendra Bhat	2012-09-17	1	-1/+2
\| \| \| \| \| \| \| \| \| \|	Change-Id: I0a43769223991e4ad5206b4382d737a0c3557bf3 BUG: 851953 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/3934 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	storage/posix: Option to set brick(of a volume)'s root dir's uid/gid	Krishnan Parthasarathi	2012-09-14	1	-5/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CLI --- gluster volume set VOLNAME owner-uid uid gluster volume set VOLNAME owner-gid gid where uid,gid are the owner's user id and group id respectively that would be set on the root of all brick (backend) fs. TODO: uid/gid should not be -1. Today we don't validate that in CLI. Change-Id: Ib6a2fb5e404691c5fe105a89faaeff3e1ab72e91 BUG: 853842 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/3891 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	All: License message change	Varun Shastry	2012-09-13	1	-7/+6
\| \| \| \| \| \| \| \| \| \| \| \|	License message changed for server-side, dual license GPLV2 and LGPLv3+. Change-Id: Ia9e53061b9d2df3b3ef3bc9778dceff77db46a09 BUG: 852318 Signed-off-by: Varun Shastry <vshastry@redhat.com> Reviewed-on: http://review.gluster.org/3940 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	storage/posix: Make posix_fremovexattr anon fd friendly.	Pranith Kumar K	2012-09-06	1	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: For anonymous fds posix_fremovexattr fails to work because the open never happens and the fd-ctx is not set with the fd-number. Fix: Use posix_fd_ctx_get which opens and sets the fd-number in the fd-ctx for anonymous fds. Tests: Added a syncop call in glustershd to test this change and it worked fine. Change-Id: I9629190a87eb27a7a1578e4fe732a5eb1248f30c BUG: 854331 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.org/3903 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	posix: adjust new xattrops to new dict API	Csaba Henk	2012-09-06	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- http://review.gluster.org/3909 introduces new xattrops - http://review.gluster.org/3829 changes the dict API The new xattrops has been written against the old dict API, but been committed after the dict API change, resulting in a build error. Change-Id: I10b9acc79927f3505b5e13116653fb9a584ffd31 BUG: 850917 Signed-off-by: Csaba Henk <csaba@redhat.com> Reviewed-on: http://review.gluster.org/3915 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	storage/posix: Add or_array/and_array op for xattrop	Pranith Kumar K	2012-09-06	1	-1/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: For set/reset of outcast (ALL changelog bits set per transaction type i.e. data/mdata/entry) from afr the capability of OR/AND in xattrop is needed in posix. Otherwise marking outcast will only be possible in self-heals where appropriate locks are held so that no other transaction is in progress, so exact number can be computed with which when XATTROP_ADD happens all bits will be set for that changelog. Fix: Implemented new xattrop-op OR_ARRAY, AND_ARRAY. Made checks in __add_array to work well with __or_array. Tests: From Afr code made an OR_ARRAY with ALL bits set and it reflected on the changelog xattrs. changelog incrementing did not have any effects on the all-set changelog. From Afr code made an AND_ARRAY with 0 and it reflected in the changelog xattrs. Change-Id: Ie89c78a43d05789e3a8fa03d2422b52083ae80b9 BUG: 847671 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.org/3909 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	libglusterfs/dict: make 'dict_t' a opaque object	Amar Tumballi	2012-09-06	1	-46/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* ie, don't dereference dict_t pointer, instead use APIs everywhere * other than dict_t only 'data_t' should be the valid export from dict.h * added 'dict_foreach_fnmatch()' API * changed dict_lookup() to use data_t, instead of data_pair_t Change-Id: I400bb0dd55519a7c5d2a107e67c8e7a7207228dc Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 850917 Reviewed-on: http://review.gluster.org/3829 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	All: License message change	Varun Shastry	2012-08-28	1	-14/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The license message is changed to Copyright (c) 2008-2012 Red Hat, Inc. <http://www.redhat.com> This file is part of GlusterFS. This file is licensed to you under your choice of the GNU Lesser General Public License, version 3 or any later version (LGPLv3 or later), or the GNU General Public License, version 2 (GPLv2), in all cases as published by the Free Software Foundation. Change-Id: I07d2b63ed5fbbbd1884f1e74f2dd56013d15b0f4 BUG: 852318 Signed-off-by: Varun Shastry <vshastry@redhat.com> Reviewed-on: http://review.gluster.org/3858 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/dht: Optimize readdirp calls in DHT	shishir gowda	2012-08-13	1	-2/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bring in option which is supported by posix xlator to filter out directory's entries from being returned. DHT would now request non-first subvols to filter out directory entries. dht xlator-option readdir-optimize will enable this optimization Change-Id: I35224bc81c9657f54f952efac02790276c35ded5 BUG: 838199 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.com/3772 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	storage/posix: use the size returned by getxattr to allocate memory	Raghavendra Bhat	2012-07-17	1	-2/+2
\| \| \| \| \| \| \| \| \|	Change-Id: I71c234b12a1d16405e508b715932022fdce346f0 BUG: 838195 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.com/3681 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	storage/posix: use ssize_t variable to get the return value of getxattr	Raghavendra Bhat	2012-07-17	1	-36/+47
\| \| \| \| \| \| \| \| \| \|	Change-Id: Ida065e108a1d2a61b134fb847e8c4981b46fc3c6 BUG: 838195 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.com/3673 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	storage/posix: implement native linux AIO support	Anand Avati	2012-07-14	1	-1/+47
\| \| \| \| \| \| \| \| \| \| \|	Configurable via cli with "storage.linux-aio" settable option Change-Id: I9929e0d6fc1bbc2a0fe1fb67bfc8d15d8a483d3f BUG: 837495 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.com/3627 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
*	remove useless if-before-free (and free-like) functions	Jim Meyering	2012-07-13	1	-8/+4
\| \| \| \| \| \| \| \| \| \| \| \|	See comments in http://bugzilla.redhat.com/839925 for the code to perform this change. Signed-off-by: Jim Meyering <meyering@redhat.com> BUG: 839925 Change-Id: I10e4ecff16c3749fe17c2831c516737e08a3205a Reviewed-on: http://review.gluster.com/3661 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	storage/posix: handle getxattr failures gracefully	Raghavendra Bhat	2012-07-11	1	-5/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use proper variable types for getting return value of getxattr calls, which otherwise can lead to segfaulting of processes or page allocation failures in the kernel. Change-Id: I62ab5d6c378447090c19846f03298c3afc8863ba BUG: 838195 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.com/3640 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	cluster/stripe: implement the coalesce stripe file format	Brian Foster	2012-06-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The coalesce file format for cluster/stripe condenses the striped files to a contiguous layout. The elimination of holes in striped files eliminates space wasted via local filesystem preallocation heuristics and significantly improves read performance. Coalesce mode is implemented with a new 'coalesce' xlator option, which is user-configurable and disabled by default. The format of newly created files is marked with a new 'stripe-coalesce' xattr. Cluster/stripe handles/preserves the format of files regardless of the current mode of operation (i.e., a volume can simultaneously consist of coalesced and non-coalesced files). Files without the stripe-coalesce attribute are assumed to have the traditional format to provide backward compatibility. extras/stripe-merge: support traditional and coalesce stripe formats Update the stripe-merge recovery tool to handle the traditional and coalesced file formats. The format of the file is detected automatically (and verified) via the stripe-coalesce attributes. BUG: 801887 Change-Id: I682f0b4e819f496ddb68c9a01c4de4688280fdf8 Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.com/3282 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	storage/posix: Prevent gfid handle leaks	Pranith Kumar K	2012-05-31	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The case which can lead to gfid handle leaks: Self-heal removes directory '/d' with 10 files in it, in brick b1. This dir is renamed to <landfill>/<hashval of '<brick-path>/d'> by posix. Before the janitor thread could remove the directory, self-heal could remove another directory with same path '/d'. Then again the rename to same path is done by posix as before. The gfid-handles of the old '/d', 10 files in it are not unlinked. To prevent such problems, rename the directory to be removed to <landfill>/<gfid-str>. Change-Id: Iad13708e1ebcc5222b64c058aa9a2d372e1bfa5b BUG: 811970 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3159 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	storage/posix: Move landfill inside .glusterfs	Pranith Kumar K	2012-05-31	1	-18/+8
\| \| \| \| \| \| \| \| \|	Change-Id: Ia2944f891dd62e72f3c79678c3a1fed389854a90 BUG: 811970 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3158 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	storage/posix: readdirp enhancements	Anand Avati	2012-05-28	1	-45/+61
\| \| \| \| \| \| \| \| \| \| \| \|	- avoid multiple calls to posix_istat(). use cheaper posix_pstat() - code re-org Change-Id: I4a2e32626ade49b7d18158952849c6fe7bd6875c BUG: 816140 Reviewed-on: http://review.gluster.com/3460 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>