summaryrefslogtreecommitdiffstats
path: root/libglusterfs/src/glusterfs.h
Commit message (Collapse)AuthorAgeFilesLines
* snapshot: Merge conflict resolutionshishir gowda2013-11-151-0/+1
| | | | Signed-off-by: shishir gowda <sgowda@redhat.com>
* bd: Add BD support to other xlatorsM. Mohan Kumar2013-11-131-0/+4
| | | | | | | | | | | | | | | | Make changes to distributed xlator to work with BD xlator. Unlike files, a block device can't be removed when its opened. So some part of the code were moved down to avoid this situation. Also before truncating a BD file its BD_XATTR should be set otherwise truncate will result in truncating posix file. So file is created with needed BD_XATTR and truncate is invoked. Also enables BD xlator in stripe volume type. Change-Id: If127516e261fac5fc5b137e7fe33e100bc92acc0 BUG: 1028672 Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/5235 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterfs: zerofill supportM. Mohan Kumar2013-11-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for a new ZEROFILL fop. Zerofill writes zeroes to a file in the specified range. This fop will be useful when a whole file needs to be initialized with zero (could be useful for zero filled VM disk image provisioning or during scrubbing of VM disk images). Client/application can issue this FOP for zeroing out. Gluster server will zero out required range of bytes ie server offloaded zeroing. In the absence of this fop, client/application has to repetitively issue write (zero) fop to the server, which is very inefficient method because of the overheads involved in RPC calls and acknowledgements. WRITESAME is a SCSI T10 command that takes a block of data as input and writes the same data to other blocks and this write is handled completely within the storage and hence is known as offload . Linux ,now has support for SCSI WRITESAME command which is exposed to the user in the form of BLKZEROOUT ioctl. BD Xlator can exploit BLKZEROOUT ioctl to implement this fop. Thus zeroing out operations can be completely offloaded to the storage device , making it highly efficient. The fop takes two arguments offset and size. It zeroes out 'size' number of bytes in an opened file starting from 'offset' position. This patch adds zerofill support to the following areas: - libglusterfs - io-stats - performance/md-cache,open-behind - quota - cluster/afr,dht,stripe - rpc/xdr - protocol/client,server - io-threads - marker - storage/posix - libgfapi Client applications can exloit this fop by using glfs_zerofill introduced in libgfapi.FUSE support to this fop has not been added as there is no system call for this fop. Changes from previous version 3: * Removed redundant memory failure log messages Changes from previous version 2: * Rebased and fixed build error Changes from previous version 1: * Rebased for latest master TODO : * Add zerofill support to trace xlator * Expose zerofill capability as part of gluster volume info Here is a performance comparison of server offloaded zeofill vs zeroing out using repeated writes. [root@llmvm02 remote]# time ./offloaded aakash-test log 20 real 3m34.155s user 0m0.018s sys 0m0.040s [root@llmvm02 remote]# time ./manually aakash-test log 20 real 4m23.043s user 0m2.197s sys 0m14.457s [root@llmvm02 remote]# time ./offloaded aakash-test log 25; real 4m28.363s user 0m0.021s sys 0m0.025s [root@llmvm02 remote]# time ./manually aakash-test log 25 real 5m34.278s user 0m2.957s sys 0m18.808s The argument log is a file which we want to set for logging purpose and the third argument is size in GB . As we can see there is a performance improvement of around 20% with this fop. Change-Id: I081159f5f7edde0ddb78169fb4c21c776ec91a18 BUG: 1028673 Signed-off-by: Aakash Lal Das <aakash@linux.vnet.ibm.com> Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/5327 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* client_t: phase 2, refactor server_ctx and locks_ctx outKaleb S. KEITHLEY2013-10-311-2/+3
| | | | | | | | | | | | | | remove server_ctx and locks_ctx from client_ctx directly and store as into discrete entities in the scratch_ctx hooking up dump will be in phase 3 BUG: 849630 Change-Id: I94cea328326db236cdfdf306cb381e4d58f58d4c Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/5678 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cli,glusterd: Changes to cli-glusterd communicationKaushal M2013-10-171-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Glusterd changes: With this patch, glusterd creates a socket file in DATADIR/run/glusterd.socket , and listen on it for cli requests. It listens for 2 rpc programs on the socket file, - The glusterd cli rpc program, for all cli commands - A reduced glusterd handshake program, just for the 'system:: getspec' command The location of the socket file can be changed with the glusterd option 'glusterd-sockfile'. To retain compatibility with the '--remote-host' cli option, glusterd also listens for the cli requests on port 24007. But, for the sake of security, it listens using a reduced cli rpc program on the port. The reduced rpc program only contains read-only procs used for 'volume (info|list|status)', 'peer status' and 'system:: getwd' cli commands. CLI changes: The gluster cli now uses the glusterd socket file for communicating with glusterd by default. A new option '--gluster-sock' has been added to allow specifying the sockfile used to connect. Using the '--remote-host' option will make cli connect to the given host & port. Tests changes: cluster.rc has been modified to make use of socket files and use different log files for each glusterd. Some of the tests using cluster.rc have been fixed. Change-Id: Iaf24bc22f42f8014a5fa300ce37c7fc9b1b92b53 BUG: 980754 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/5280 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: [Feature] Command implementation to get heal-countVenkatesh Somyajulu2013-10-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently to know the number of files to be healed, either user has to go to backend and check the number of entries present in indices/xattrop directory. But if a volume consists of large number of bricks, going to each backend and counting the number of entries is a time-taking task. Otherwise user can give gluster volume heal vol-name info command but with this approach if no. of entries are very hugh in the indices/ xattrop directory, it will comsume time. So as a feature, new command is implemented. Command 1: gluster volume heal vn statistics heal-count This command will get the number of entries present in every brick of a volume. The output displays only entries count. Command 2: gluster volume heal vn statistics heal-count replica 192.168.122.1:/home/user/brickname Here if we are concerned with just one replica. So providing any one of the brick of a replica will get the number of entries to be healed for that replica only. Example: Replicate volume with replica count 2. Backend status: -------------- [root@dhcp-0-17 xattrop]# ls -lia | wc -l 1918 NOTE: Out of 1918, 2 entries are <xattrop-gfid> dummy entries so actual no. of entries to be healed are 1916. [root@dhcp-0-17 xattrop]# pwd /home/user/2ty/.glusterfs/indices/xattrop Command output: -------------- Gathering count of entries to be healed on volume volume3 has been successful Brick 192.168.122.1:/home/user/22iu Status: Brick is Not connected Entries count is not available Brick 192.168.122.1:/home/user/2ty Number of entries: 1916 Change-Id: I72452f3de50502dc898076ec74d434d9e77fd290 BUG: 1015990 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/6044 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* gNFS: Incorrect NFS ACL encoding for XFSSantosh Kumar Pradhan2013-09-291-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | Problem: Incorrect NFS ACL encoding causes "system.posix_acl_default" setxattr failure on bricks on XFS file system. XFS (potentially others?) doesn't understand when the 0x10 prefix is added to the ACL type field for default ACLs (which the Linux NFS client adds) which causes setfacl()->setxattr() to fail silently. NFS client adds NFS_ACL_DEFAULT(0x1000) for default ACL. FIX: Mask the prefix (added by NFS client) OFF, so the setfacl is not rejected when it hits the FS. Original patch by: "Richard Wareing" Change-Id: I17ad27d84f030cdea8396eb667ee031f0d41b396 BUG: 1009210 Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com> Reviewed-on: http://review.gluster.org/5980 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterfsd: Round robin DNS should not be relied upon withHarshavardhana2013-09-061-11/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | config service availability for clients. Backupvolfile server as it stands is slow and prone to errors with mount script and its combination with RRDNS. Instead in theory it should use all the available nodes in 'trusted pool' by default (Right now we don't have a mechanism in place for this) Nevertheless this patch provides a scenario where a list of volfile-server can be provided on command as shown below ----------------------------------------------------------------- $ glusterfs -s server1 .. -s serverN --volfile-id=<volname> \ <mount_point> ----------------------------------------------------------------- OR ----------------------------------------------------------------- $ mount -t glusterfs -obackup-volfile-servers=<server2>: \ <server3>:...:<serverN> <server1>:/<volname> <mount_point> ----------------------------------------------------------------- Here ':' is used as a separator for mount script parsing Now these will be remembered and recursively attempted for fetching vol-file until exhausted. This would ensure that the clients get 'volume' configs in a consistent manner avoiding the need to poll through RRDNS. Change-Id: If808bb8a52e6034c61574cdae3ac4e7e83513a40 BUG: 986429 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/5400 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* core: remove GLUSTERFS_CREATE_MODE_KEY usageAmar Tumballi2013-08-231-1/+0
| | | | | | | | | Change-Id: I23b8cb7223b91a55af1cd4214f61bbe0e87351f6 BUG: 952029 Signed-off-by: Amar Tumballi <amarts@redhat.com> Reviewed-on: http://review.gluster.org/5683 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* core: changes to support gfid-accessAmar Tumballi2013-08-211-0/+2
| | | | | | | | | Change-Id: I38d2fdc47e4b805deafca6805e54807976ffdb7e Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 952029 Reviewed-on: http://review.gluster.org/5496 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Revert "fuse: auxiliary gfid mount support"Amar Tumballi2013-08-211-2/+0
| | | | | | | | | | | | | | | | | This reverts commit 4c0f4c8a89039b1fa1c9c015fb6f273268164c20. Conflicts: xlators/mount/fuse/src/fuse-bridge.c For build issues added CREATE_MODE_KEY definition in: libglusterfs/src/glusterfs.h Change-Id: I8093c2a0b5349b01e1ee6206025edbdbee43055e BUG: 952029 Signed-off-by: Amar Tumballi <amarts@redhat.com> Reviewed-on: http://review.gluster.org/5495 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* afr: treat appending writes as stable writes.Anand Avati2013-08-131-0/+1
| | | | | | | | | | | | | Durability of appending writes is implicit in the file size. Therefore performing an explicit fsync() is unnecessary in such cases as self-heal can check for the size of file when pending changelog is not unambiguous. Change-Id: I05446180a91d20e0dbee5de5a7085b87d57f178a BUG: 927146 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/5501 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* libglusterfs/client_t client_t implementation, phase 1Kaleb S. KEITHLEY2013-07-291-47/+49
| | | | | | | | | | | | | | | | | | | | | | | | Implementation of client_t The feature page for client_t is at http://www.gluster.org/community/documentation/index.php/Planning34/client_t In addition to adding libglusterfs/client_t.[ch] it also extracts/moves the locktable functionality from xlators/protocol/server to libglusterfs, where it is used; thus it may now be shared by other xlators too. This patch is large as it is. Hooking up the state dump is left to do in phase 2 of this patch set. (N.B. this change/patch-set supercedes previous change 3689, which was corrupted during a rebase. That change will be abandoned.) BUG: 849630 Change-Id: I1433743190630a6d8119a72b81439c0c4c990340 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/3957 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* cluster/afr: Handle REPLICATE_TRASH_DIR from old bricksPranith Kumar K2013-07-261-0/+1
| | | | | | | | | Change-Id: Ib99f79d3fa607c818dbc62006516480f598d8add BUG: 886998 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4640 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* core: increase the auxillary group limit to 65536Anand Avati2013-07-241-1/+1
| | | | | | | | | | | | | Make the allocation of groups dynamic and increase the limit to 65536. Change-Id: I702364ff460e3a982e44ccbcb3e337cac9c2df51 BUG: 953694 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/5111 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/changelog: changelog translatorAvra Sengupta2013-07-221-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the initial version of the Changelog Translator. What is it ----------- Goal is to capture changes performed on a GlusterFS volume. The translator needs to be loaded on the server (bricks) and captures changes in a plain text file inside a configured directory path (controlled by "changelog-dir", should be somewhere in <export>/.glusterfs/changelog by default). Changes are classified into 3 types: - Data: : TYPE-I - Metadata : TYPE-II - Entry : TYPE-III Changelog file is rolled over after a certain time interval (defauls to 60 seconds) after which a changelog is started. The thing to be noted here is that for a time interval (time slice) multiple changes for an inode are recorded only once (ie. say for 100+ writes on an inode that happens within the time slice has only a single corresponding entry in the changelog file). That way we do not bloat up the changelog and also save lots of writes. Changelog Format ----------------- TYPE-I and TYPE-II changes have the gfid on the entity on which the operation happened. TYPE-III being a entry op requires the parent gfid and the basename. Changelog format has been kept to a minimal and it's upto the consumers to do the heavy loading of figuring out deletes, renames etc.. A single changelog file records all three types of changes, with each change starting with an identifier ("D": DATA, "M": METADATA and "E": ENTRY). Option is provided for the encoding type (See TUNABLES). Consumers ---------- The only consumer as of today would be geo-replication, although backup utilities, self-heal, bit-rot detection could be possible consumers in the future. CLI ---- By default, change-logging is disabled (the translator is present in the server graph but does nothing). When enabled (via cli) each brick starts to log the changes. There are a set of tunable that can be used to change the translators behaviour: - enable/disable changelog (disabled by default) gluster volume set <volume> changelog {on|off} - set the logging directory (<brick>/.glusterfs/changelogs is the default) gluster volume set <volume> changelog-dir /path/to/dir - select encoding type (binary (default) or ascii) gluster volume set <volume> encoding {binary|ascii} - change the rollover time for the logs (60 secs by default) gluster volume set <volume> rollover-time <secs> - when secs > 0, changelog file is not open()'d with O_SYNC flag - and fsync is trigerred periodically every <secs> seconds. gluster volume set <volume> fsync-interval <secs> features/changelog: changelog consumer library (libgfchangelog) A shared library is provided for the consumer of the changelogs for easy acess via APIs. Application can link against this library and request for changelog updates. Conversion of binary logs to human-readable ascii format is also taken care by the library which keeps a copy of the changelog in application provided working directory. Change-Id: I75575fb7f1c53d2bec3dba1a329ea7bb3c628497 BUG: 847839 Original Author: Venky Shankar <vshankar@redhat.com> Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5127 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* fuse: auxiliary gfid mount supportRaghavendra G2013-07-191-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * files can be accessed directly through their gfid and not just through their paths. For eg., if the gfid of a file is f3142503-c75e-45b1-b92a-463cf4c01f99, that file can be accessed using <gluster-mount>/.gfid/f3142503-c75e-45b1-b92a-463cf4c01f99 .gfid is a virtual directory used to seperate out the namespace for accessing files through gfid. This way, we do not conflict with filenames which can be qualified as uuids. * A new file/directory/symlink can be created with a pre-specified gfid. A setxattr done on parent directory with fuse_auxgfid_newfile_args_t initialized with appropriate fields as value to key "glusterfs.gfid.newfile" results in the entry <parent>/bname whose gfid is set to args.gfid. The contents of the structure should be in network byte order. struct auxfuse_symlink_in { char linkpath[]; /* linkpath is a null terminated string */ } __attribute__ ((__packed__)); struct auxfuse_mknod_in { unsigned int mode; unsigned int rdev; unsigned int umask; } __attribute__ ((__packed__)); struct auxfuse_mkdir_in { unsigned int mode; unsigned int umask; } __attribute__ ((__packed__)); typedef struct { unsigned int uid; unsigned int gid; char gfid[UUID_CANONICAL_FORM_LEN + 1]; /* a null terminated gfid string * in canonical form. */ unsigned int st_mode; char bname[]; /* bname is a null terminated string */ union { struct auxfuse_mkdir_in mkdir; struct auxfuse_mknod_in mknod; struct auxfuse_symlink_in symlink; } __attribute__ ((__packed__)) args; } __attribute__ ((__packed__)) fuse_auxgfid_newfile_args_t; An initial consumer of this feature would be geo-replication to create files on slave mount with same gfids as that on master. It will also help gsyncd to access files directly through their gfids. gsyncd in its newer version will be consuming a changelog (of master) containing operations on gfids and sync corresponding files to slave. * Also, bring in support to heal gfids with a specific value. fuse-bridge sends across a gfid during a lookup, which storage translators assign to an inode (file/directory etc) if there is no gfid associated it. This patch brings in support to specify that gfid value from an application, instead of relying on random gfid generated by fuse-bridge. gfids can be healed through setxattr interface. setxattr should be done on parent directory. The key used is "glusterfs.gfid.heal" and the value should be the following structure whose contents should be in network byte order. typedef struct { char gfid[UUID_CANONICAL_FORM_LEN + 1]; /* a null terminated gfid * string in canonical form */ char bname[]; /* a null terminated basename */ } __attribute__((__packed__)) fuse_auxgfid_heal_args_t; This feature can be used for upgrading older geo-rep setups where gfids of files are different on master and slave to newer setups where they should be same. One can delete gfids on slave using setxattr -x and .glusterfs and issue stat on all the files with gfids from master. Thanks to "Amar Tumballi" <amarts@redhat.com> and "Csaba Henk" <csaba@redhat.com> for their inputs. Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: Ie8ddc0fb3732732315c7ec49eab850c16d905e4e BUG: 952029 Reviewed-on: http://review.gluster.com/#/c/4702 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Amar Tumballi <amarts@redhat.com> Reviewed-on: http://review.gluster.org/4702 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* include <limits.h> for PATH_MAXEmmanuel Dreyfus2013-07-181-0/+1
| | | | | | | | | | | | | | I need to include <limits.h> in order to use PATH_MAX, Otherwise it will not build at mine. I believe it is standard compliant to do so: http://pubs.opengroup.org/onlinepubs/009695399/basedefs/limits.h.html BUG: 764655 Change-Id: I3f124466f7f7742e94a9d1256bc9239ec16aab04 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/5340 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* locks: Added an xdata-based 'cmd' for inodelk count in a given domainshishir gowda2013-07-181-0/+1
| | | | | | | | | | | | | | | | | | | | Following is the semantics of the 'cmd': 1) If @domain is NULL - returns no. of locks blocked/granted in all domains 2) If @domain is non-NULL- returns no. of locks blocked/granted in that domain 3) If @domain is non-existent - returns '0'; This is important since locks xlator creates a domain in a lazy manner. where @domain - a string representing the domain. Change-Id: I5e609772343acc157ca650300618c1161efbe72d BUG: 951195 Original-author: Krishnan Parthasarathi <kparthas@redhat.com> Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4889 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* cluster/*: get logic to calculate min() of the 'stime' xattrAvra Sengupta2013-07-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | * in both distribute and replicate (ignoring stripe for now), add logic to calculate the min() of stime values. * What is a 'stime' ? Why is this required: - stime means 'slave xtime', mainly used to keep track of slave node's sync status when distributed geo-replication is used. Logic of calculating 'min()' for this stime is very important as in case of crashes/reboots/shutdown, we will have to 'restart' with crawling from stime time value from the mount point, which gives the 'min()' of all the bricks, which means, we don't miss syncing any files in the above cases. Change-Id: I2be8d434326572be9d4986db665570a6181db1ee BUG: 847839 Original Author: Amar Tumballi <amarts@redhat.com> Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/4893 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mount/fuse: Provide option to use/not use kernel-readdirpPranith Kumar K2013-07-121-0/+1
| | | | | | | | | | | | | | By default fuse kernel readdirp usage in fuse xlator is off. When mount option use-readdirp=yes is provided it starts using fuse-kernel's readdirp. Change-Id: Id37edc53b1adc1638186d956c2f74c1e4e48aa59 BUG: 983477 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5322 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mount/fuse: expose 'glusterfs.gfid*' virtual xattr keyAvra Sengupta2013-07-111-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | currently two keys are exposed: 'glusterfs.gfid' : output is 16byte binary gfid 'glusterfs.gfid.string' : output is 36 byte canonical format of gfid e.g. [root@supernova glusterfs]# getfattr -n glusterfs.gfid -e hex f0 glusterfs.gfid=0x68305acb73e541719804fcf36a4857e8 [root@supernova glusterfs]# getfattr -n glusterfs.gfid.string f0 glusterfs.gfid.string="68305acb-73e5-4171-9804-fcf36a4857e8" early consumers for this key would be geo-replication (as it has being designed to do namespace operations on gfid from the mount point, thereby needing the GFID for entry operations on the slave). Change-Id: I10b23dbd11628566ad6924334253f5d85d01a519 BUG: 847839 Original Author: Venky Shankar <vshankar@redhat.com> Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5129 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterfs: discard (hole punch) supportBrian Foster2013-06-131-0/+1
| | | | | | | | | | | | | | | | Add support for the DISCARD file operation. Discard punches a hole in a file in the provided range. Block de-allocation is implemented via fallocate() (as requested via fuse and passed on to the brick fs) but a separate fop is created within gluster to emphasize the fact that discard changes file data (the discarded region is replaced with zeroes) and must invalidate caches where appropriate. BUG: 963678 Change-Id: I34633a0bfff2187afeab4292a15f3cc9adf261af Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/5090 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* gluster: add fallocate fop supportBrian Foster2013-06-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement support for the fallocate file operation. fallocate allocates blocks for a particular inode such that future writes to the associated region of the file are guaranteed not to fail with ENOSPC. This patch adds fallocate support to the following areas: - libglusterfs - mount/fuse - io-stats - performance/md-cache,open-behind - quota - cluster/afr,dht,stripe - rpc/xdr - protocol/client,server - io-threads - marker - storage/posix - libgfapi BUG: 949242 Change-Id: Ice8e61351f9d6115c5df68768bc844abbf0ce8bd Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/4969 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* dht,posix: support for case discoveryAnand Avati2013-05-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is support for discovering a filename in a given directory which has a case insensitive match of a given name. It is implemented as a virtual extended attribute on the directory where the required filename is specified in the key. E.g: sh# getfattr -e "text" -n user.glusterfs.get_real_filename:FiLe-B /mnt/samba/patchy getfattr: Removing leading '/' from absolute path names # file: mnt/samba/patchy user.glusterfs.get_real_filename:FiLe-B="file-b" In reality, there can be multiple "answers" as the backend filesystem is case sensitive and there can be multiple files which can strcasecamp() successfully. In this case we pick the first matched file from the first responding server. If a matching file does not exist, we return ENOENT (and NOT ENODATA). This way the caller can differentiate between "unsupported" glusterfs API and file not existing. This API is used by Samba VFS to perform efficient discovery of the real filename without doing a full scan at the Samba level. Change-Id: I53054c4067cba69e585fd0bbce004495bc6e39e8 BUG: 953694 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4941 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterfs : Moved option files, and statedumps from /tmpAvra Sengupta2013-01-291-0/+2
| | | | | | | | | Change-Id: Ibdede396c4d6859225937316b7a59a661bcaf9f5 BUG: 764890 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/4422 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd, cli: Task id's for async tasksKaushal M2012-12-191-0/+4
| | | | | | | | | | | | | | | | | | | | | | This patch introduces task-id's for async tasks like rebalance, remove-brick and replace-brick. An id is generated for each task when it is started and displayed to the user in cli output. The status of running tasks is also included in the output of "volume status" along with its id, so that a user can easily track the progress of an async task. Also, * added tests for this feature into the regression test suite. * added a python script for creating files, 'create-files.py', courtesy Vijaykumar Koppad (vkoppad@redhat.com) into the test suite. This patch reverts the revert commit 698deb33d731df6de84da8ae8ee4045e1543a168. BUG: 857330 Change-Id: Id43d7cb629a38f47f733fbc18cb4c5f2f0327c7a Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/4294 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Revert "glusterd, cli: Task id's for async tasks"Anand Avati2012-12-041-4/+0
| | | | | | | | | | | This reverts commit ed15521d4e5af2b52b78fd33711e7562f5273bc6 Strangely, the test scripts are "silently" passing for failures too. Reverting patch for now. Change-Id: I802ec1634c7863dc373cc7dc4a47bd4baa72764e Reviewed-on: http://review.gluster.org/4267 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd, cli: Task id's for async tasksKaushal M2012-12-041-0/+4
| | | | | | | | | | | | | | | | | | | | This patch introduces task-id's for async tasks like rebalance, remove-brick and replace-brick. An id is generated for each task when it is started and displayed to the user in cli output. The status of running tasks is also included in the output of "volume status" along with its id, so that a user can easily track the progress of an async task. Also, * added tests for this feature into the regression test suite. * added a python script for creating files, 'create-files.py', courtesy Vijaykumar Koppad (vkoppad@redhat.com) into the test suite. Change-Id: Ib0c0d12e0d6c8f72ace48d303d7ff3102157e876 BUG: 857330 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/3942 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* features/locks: implement fgetxattr and fsetxattrRaghavendra G2012-11-271-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | implement xattrs for GF_XATTR_LOCKINFO_KEY, which will be used for posix-locks migration from old to new graph after a switch. fgetxattr (fd, GF_XATTR_LOCKINFO_KEY) will return a dict. This dict has a serialized dict stored for key GF_XATTR_LOCKINFO_KEY. This serialized dict in turn has fdnum value of locks acquired on this fd with modified pathinfo (containing hostname and base directory components) as key. fsetxattr (newfd, GF_XATTR_LOCKINFO_KEY, dict) has following semantics. dict can be the result of a previous fgetxattr with GF_XATTR_LOCKINFO_KEY. In that case, a dict_get on dict constructed using serialized buffer is done on modified pathinfo as key. If a value is got, that value is treated as fdnum and for every lock l on newfd->inode we do, if (l->fdnum == fdnum) { l->fdnum = fd_fdnum (newfd); l->transport = <connection identifier of connection on which fsetxattr came>; } Signed-off-by: Raghavendra G <raghavendra@gluster.com> Change-Id: I73a8f43aa0b6077bc19f8de52205ba748f2d8bbe BUG: 808400 Reviewed-on: http://review.gluster.org/4120 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* cluster/stripe: handle GF_XATTR_LOCKINFO_KEY in f(get)(set)xattrRaghavendra2012-11-271-0/+2
| | | | | | | | | Change-Id: I4463006a7f54c05e757d877c56e1330fd91aec45 BUG: 808400 Signed-off-by: Raghavendra <raghavendra@gluster.com> Reviewed-on: http://review.gluster.org/4125 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* core: enable process to return the appropriate error codeAnand Avati2012-09-211-0/+1
| | | | | | | | | | | | Setup a pipe() in glusterfs_ctx_t to communicate with the fork'ed child the exit status and return it to the shell. Change-Id: Ibbaa581d45acb24d5141e43ae848cae29312d95f BUG: 762935 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/3836 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* fuse-bridge: Pass unknown option down to fuseLubomir Rintel2012-09-121-0/+2
| | | | | | | | | | | | | | | | | | In Linux, certain "filesystem-specific" options (passed in string form in last argument to mount(2)), such as "rootcontext" or "context" are in fact common to all filesystems, including fuse. We should pass them down to the actual mount(2) call untouched. This is achieved by adding "fuse-mountopts" option to mount/fuse translator and adjusting the mount helper to propagate it with unrecognized options as they are encountered. BUG: 852754 Change-Id: I309203090c02025334561be235864d8d04e4159b Signed-off-by: Lubomir Rintel <lubo.rintel@gooddata.com> Reviewed-on: http://review.gluster.org/3871 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* mount/fuse: add mount-option "enable-ino32" for the native clientNiels de Vos2012-09-061-0/+1
| | | | | | | | | | | | | | By default the GlusterFS-native client uses 64-bit inodes. Some 32-bit applications can not handle these correctly. Introduce a client-side mount option "enable-ino32" which causes the FUSE-client to squash the 64-bit inodes into a 32-bit value. Change-Id: I3296d16528bfb50457b9675f6b8701234ed82ff0 BUG: 850352 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/3885 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* storage/posix: Add or_array/and_array op for xattropPranith Kumar K2012-09-061-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: For set/reset of outcast (ALL changelog bits set per transaction type i.e. data/mdata/entry) from afr the capability of OR/AND in xattrop is needed in posix. Otherwise marking outcast will only be possible in self-heals where appropriate locks are held so that no other transaction is in progress, so exact number can be computed with which when XATTROP_ADD happens all bits will be set for that changelog. Fix: Implemented new xattrop-op OR_ARRAY, AND_ARRAY. Made checks in __add_array to work well with __or_array. Tests: From Afr code made an OR_ARRAY with ALL bits set and it reflected on the changelog xattrs. changelog incrementing did not have any effects on the all-set changelog. From Afr code made an AND_ARRAY with 0 and it reflected in the changelog xattrs. Change-Id: Ie89c78a43d05789e3a8fa03d2422b52083ae80b9 BUG: 847671 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.org/3909 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* fuse: make background queue length configurableAmar Tumballi2012-08-221-1/+2
| | | | | | | | | | | | | | | | * also make 'congestion_threshold' an option * make 'congestion_threshold' as 75% of background queue length if not explicitely specified * in glusterfsd.c, moved all the fuse option dictionary setting code to separate function Change-Id: Ie1680eefaed9377720770a09222282321bd4132e Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 845214 Reviewed-on: http://review.gluster.org/3830 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/dht: Optimize readdirp calls in DHTshishir gowda2012-08-131-0/+2
| | | | | | | | | | | | | | | | | Bring in option which is supported by posix xlator to filter out directory's entries from being returned. DHT would now request non-first subvols to filter out directory entries. dht xlator-option readdir-optimize will enable this optimization Change-Id: I35224bc81c9657f54f952efac02790276c35ded5 BUG: 838199 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.com/3772 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* core: moved back the pthread_key_t specific variables as globalAmar Tumballi2012-08-061-6/+0
| | | | | | | | | | | | | in a patch to move all the global variables inside 'ctx', moved all the pthread_key_t specific globals, which needed to be global, not inside some structures. Change-Id: I5e7107a8a64f5b80e90fd469fb084f62b2312705 Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 764890 Reviewed-on: http://review.gluster.com/3783 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* core: reduce the usage of global variablesAmar Tumballi2012-08-031-9/+13
| | | | | | | | | | | | | | | | | | | | | | | * move all the 'logging' related global variables into ctx * make gf_fop_list a 'const' global array, hence no init(), no edits. * make sure ctx is allocated without any dependancy on memory-accounting infrastructure, so it can be the first one to get allocated * globals_init() should happen with ctx as argument not yet fixed below in this patchset: * anything with 'THIS' related globals * anything related to compat_errno related globals as its one time init'd and not changed later on. * statedump related globals Change-Id: Iab8fc30d4bfdbded6741d66ff1ed670fdc7b7ad2 Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 764890 Reviewed-on: http://review.gluster.com/3767 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* acl: enable handling of FMODE_EXEC flagAmar Tumballi2012-07-271-1/+7
| | | | | | | | | | | | | | | | | | | | | | on linux systems, with open(), we can get below flag as per 'linux/fs.h'. /* File is opened for execution with sys_execve / sys_uselib */ Instead of adding '#include <linux/fs.h>, its better to copy this absolute number into other variable because then we have to deal with declaring fmode_t etc etc.. With the fix, we can handle the file with '0711' permissions in the same way as backend linux filesystems. Change-Id: Ib1097fc0d2502af89c92d561eb4123cba15713f5 Signed-off-by: Amar Tumballi <amarts@redhat.com> Reviewed-on: http://review.gluster.com/3739 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* fuse-bridge: expose negative entry caching of FUSEAnand Avati2012-07-191-0/+2
| | | | | | | | | | | | | | | | Fuse kernel module supports caching negative entries, enabled by specifying a timeout while returning ENOENT to lookup. This patch enables the functionality to be enabled with the command line. Also fixed a typo bug in mount.glusterfs.in. Change-Id: I47eab2834cca9a05887266358afbf504bbb4c489 BUG: 841417 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.com/3696 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
* glusterfs_ctx_t: un-globalize the filesystem contextAnand Avati2012-07-171-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | So far there has been a global glusterfs_ctx_t object which represents the running instance of the filesystem (client or server). It contains the various graphs, connection to the management daemon over which new graphs are obtained, calls stacks issued on this filesystem, and a bunch of such things. With the introduction of libgfapi, it is no more true that there will be only one filesystem context in a process. Applications can be written to use libgfapi and obtain serveral instances of different filesystems/volumes in the same process. This involves messy untangling of assumptions inside libglusterfs that there would only be one global glusterfs_ctx_t and offload that assumption to glusterfsd/ and cli/ (where it is true). Change-Id: Ifd7d1259428c26076140a5764a2dc7361694139c BUG: 839950 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.com/3678 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* libglusterfs,mount/fuse: implement gidcache mechanism in fuse-bridgeBrian Foster2012-07-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | This change genericizes the cache mechanism implemented in commit 8efd2845 into libglusterfs/src/gidcache.[ch] and adds fuse-bridge as a client. The cache mechanism is fundamentally equivalent, with some minor changes: - Change cache key from uid_t to uint64_t. - Modify the cache add logic to locate and use an entry with a matching ID, should it already exist. This addresses a bug in the existing mechanism where an expired entry supercedes a newly added entry in lookup, causing repeated adds and flushing of a cache bucket. The fuse group cache is disabled by default. It can be enabled via the 'gid-timeout' fuse-bridge translator option and accompanying mount option (i.e., '-o gid-timeout=1' for a 1s entry timeout). BUG: 800892 Change-Id: I0b34a2263ca48dbb154790a4a44fc70b733e9114 Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.com/3676 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* fuse/md-cache: add support for the 'fopen-keep-cache' mount optionBrian Foster2012-07-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | fopen-keep-cache disables unconditional page-cache invalidations on file open in fuse (via FOPEN_KEEP_CACHE) and replaces that behavior with detection of remote changes and explicit invalidations from mount/fuse. This option improves local caching through the page cache and native client. This change defines a new 'invalidate' translator callback to identify when an inode's cache mapping has been determined to be invalid. md-cache implements the policy to detect and invoke inode invalidations. fuse-bridge and io-cache implement invalidate handlers to invalidate the respective caches (page cache in the case of fuse). BUG: 833564 Change-Id: I99818da5777eaf06276c1c0b194669f5bab92d48 Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.com/3584 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/dht: Remove dht dependency on glusterfsd-mgmtshishir gowda2012-06-291-0/+4
| | | | | | | | | | | | | | | glusterfs_ctx->notify can be used by any xlator to talk to glusterfsd-mgmt. Note- This is for any rpc communication initiated by the xlator, and not from glusterd. Change-Id: Ic0e4af106fe1e98d797ca621facda8839b87598a BUG: 835757 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.com/3618 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* storage/posix: Move landfill inside .glusterfsPranith Kumar K2012-05-311-6/+0
| | | | | | | | | Change-Id: Ia2944f891dd62e72f3c79678c3a1fed389854a90 BUG: 811970 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3158 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* fuse: make SELinux support configurableAnand Avati2012-05-291-0/+1
| | | | | | | | | | | | | | Make support for SELinux labels (extended attributes) configurable and disabled by default as it can cause significant performance penalty when enabled (it need not be enabled unless specially crafted policies are set -- which is not by default) Change-Id: I97bc4b1c26cf055fd520e9bf2d49e52b14fe7515 BUG: 811217 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.com/3484 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* fuse, glusterfsd: mount logic fixesCsaba Henk2012-05-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 7d0397c2 introduced two issues: i) broke the libfuse derived mount logic (details below) ii) in case of a daemonized glusterfs client is ran as daemon, parent process can return earlier than the mount is in place, which breaks agents that programmatically do a gluster mount via a direct call to glusterfs (ie. not via mount(8)). This patch fixes these issues by a refactor that merges the approaches sported by commits 7d0397c2 fuse: allow requests during mount (needed for SELinux labels) c5d781e0 upon daemonizing, wait on mtab update to terminate in parent Original daemonized libfuse event flow is as follows: try: fd = open("/dev/fuse") mount("-oopts,fd=%s" % fd ...) mount(8) -f # manipulate mtab except: sp = socketpair() env _FUSE_COMMFD=sp fusermount -oopts fd = receive_fd(sp) where fusermount(1) does: fd = open("/dev/fuse") mount("-oopts,fd=%d" % fd ...) sp = atoi(getenv("_FUSE_COMMFD")) send_fd(sp, fd) daemonize( # in child fuse_loop(fd) ) # in parent exit() As of 013850c9 (instead of adopting FUSE's 47e61004¹), we went for async mtab manipulation, and as of c5d781e0, still wanted keep that in sync with termination of daemon parent, so we changed it to: try: fd = open("/dev/fuse") mount("-oopts,fd=%s" % fd ...) pid = fork( # in child mount(8) -f ) except: sp = socketpair() env _FUSE_COMMFD=sp fusermount -oopts fd = receive_fd(sp) daemonize( fuse_loop(fd) ) waitpid(pid) exit() (Note the new approch came only to direct [privileged] mount, so fusermount based mounting was already partially broken.) As of 7d0397c2, with the purpose of facilitating async mount, the event flow was practically reduced to: fd = open("/dev/fuse") fork( mount("-oopts,fd=%s" % fd ...) fork( mount(8) -n ) ) daemonize( fuse_loop(fd) ) exit() Thus fusermount based mounting become defunct; however, the dead code was still kept around. So, we should either drop it or fix it. Also, the mtab manipulator is forked into yet another child with no purpose, while syncing with it in daemon parent is broken. mount(2) is neither synced with parent. Now we are coming to the following scheme: fd = open("/dev/fuse") pid = fork( try: mount("-oopts,fd=%s" % fd ...) mount(8) -n except: env _FUSE_DEVFD=fd fusermount -oopts ) where fusermount(1) does: fd = getenv("_FUSE_DEVFD") mount("-oopts,fd=%s" % fd ...) daemonize( fuse_loop(fd) ) waitpid(pid) exit() Nb.: - We can't help losing compatibility with upstream fusermount, as it sends back the fd only when mount(2) is completed, thus defeating the async mount approach. The 'getenv("_FUSE_DEVFD")' mechanism is specfic to glusterfs' fusermount (at the moment -- sure we can talk about it with upstream) - fusermount opens /dev/fuse at same privilege level as of original process², so we can bravely go on with doing the open unconditionally in original process - Original mounting code actually tries to mount through fusermount _twice_: if first attempt fails, then, assuming subtype support is missing in kernel, it tries again subtype stripped. However, this is redundant, as fusermount internally also performs the subtype check³. Therefore we simplified the logic to have just a single fusermount call. - we revert the changes to mount.glusterfs as of 7d0397c2, as now there is no issue with glusterfs to work around in that scope ¹ http://fuse.git.sourceforge.net/git/gitweb.cgi?p=fuse/fuse;a=blobdiff;f=ChangeLog;h=47e61004;hb=4c3d9b19;hpb=e61b775a ² http://fuse.git.sourceforge.net/git/gitweb.cgi?p=fuse/fuse;a=blob;f=util/fusermount.c;h=b2e87d95#l1023 ³ http://fuse.git.sourceforge.net/git/gitweb.cgi?p=fuse/fuse;a=blob;f=util/fusermount.c;h=b2e87d95#l839 Change-Id: I0c4ab70e0c5ad7b27337228749b266bcd0ba941d BUG: 811217 Signed-off-by: Csaba Henk <csaba@redhat.com> Reviewed-on: http://review.gluster.com/3341 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Fixed glusterd_brick_create_path algo.Krishnan Parthasarathi2012-05-161-0/+1
| | | | | | | | | | | | | | | | | | | | | - check if any prefix of the brick path has "trusted.gfid" or "trusted.glusterfs.volume-id" set. - set trusted.glusterfs.volume-id on the bricks as soon as its induction into the volume is settled. Earlier, the setting of "volume-id" used to happen during the first run of the brick process, leaving of window for bricks part of one volume to be (ab)used by another volume inadvertently. - removed creation of brick directory (if missing), during start volume force. This is to avoid directory creation as part 'force'ful starting of volume and leave the responsibility with the user, who understands the 'availability' of the export directory (brick) better. Change-Id: I4237ec4ea7a4e38a7501027e7de7112edd67de8c BUG: 812214 Signed-off-by: Krishnan Parthasarathi <kp@gluster.com> Reviewed-on: http://review.gluster.com/3280 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* license: dual license under GPLV2 and LGPLV3+Kaleb KEITHLEY2012-05-101-16/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note that the license was not changed in any of the following: .../argp-standalone/... .../booster/... .../cli/... .../contrib/... .../extras/... .../glusterfsd/... .../glusterfs-hadoop/... .../mod_clusterfs/... .../scheduler/... .../swift/... The license was not changed in any of the non-building xlators. The license was not changed in any of the xlators that seemed — to me — to be clearly server-side only, e.g. protocol/server Note too that copyright was changed along with the license; I did not change the copyright in files where the license did not change. If you find any errors or ommissions please don't hesitate to let me know. The complete list of files with the license change is: libglusterfs/src/byte-order.h libglusterfs/src/call-stub.c libglusterfs/src/call-stub.h libglusterfs/src/checksum.c libglusterfs/src/checksum.h libglusterfs/src/circ-buff.c libglusterfs/src/circ-buff.h libglusterfs/src/common-utils.c libglusterfs/src/common-utils.h libglusterfs/src/compat-errno.c libglusterfs/src/compat-errno.h libglusterfs/src/compat.c libglusterfs/src/compat.h libglusterfs/src/daemon.c libglusterfs/src/daemon.h libglusterfs/src/defaults.c libglusterfs/src/defaults.h libglusterfs/src/dict.c libglusterfs/src/dict.h libglusterfs/src/event-history.c libglusterfs/src/event-history.h libglusterfs/src/event.c libglusterfs/src/event.h libglusterfs/src/fd-lk.c libglusterfs/src/fd-lk.h libglusterfs/src/fd.c libglusterfs/src/fd.h libglusterfs/src/gf-dirent.c libglusterfs/src/gf-dirent.h libglusterfs/src/globals.c libglusterfs/src/globals.h libglusterfs/src/glusterfs.h libglusterfs/src/graph-print.c libglusterfs/src/graph-utils.h libglusterfs/src/graph.c libglusterfs/src/hashfn.c libglusterfs/src/hashfn.h libglusterfs/src/iatt.h libglusterfs/src/inode.c libglusterfs/src/inode.h libglusterfs/src/iobuf.c libglusterfs/src/iobuf.h libglusterfs/src/latency.c libglusterfs/src/latency.h libglusterfs/src/list.h libglusterfs/src/lkowner.h libglusterfs/src/locking.h libglusterfs/src/logging.c libglusterfs/src/logging.h libglusterfs/src/mem-pool.c libglusterfs/src/mem-pool.h libglusterfs/src/mem-types.h libglusterfs/src/options.c libglusterfs/src/options.h libglusterfs/src/rbthash.c libglusterfs/src/rbthash.h libglusterfs/src/run.c libglusterfs/src/run.h libglusterfs/src/scheduler.c libglusterfs/src/scheduler.h libglusterfs/src/stack.c libglusterfs/src/stack.h libglusterfs/src/statedump.c libglusterfs/src/statedump.h libglusterfs/src/syncop.c libglusterfs/src/syncop.h libglusterfs/src/syscall.c libglusterfs/src/syscall.h libglusterfs/src/timer.c libglusterfs/src/timer.h libglusterfs/src/trie.c libglusterfs/src/trie.h libglusterfs/src/xlator.c libglusterfs/src/xlator.h libglusterfsclient/src/libglusterfsclient-dentry.c libglusterfsclient/src/libglusterfsclient-internals.h libglusterfsclient/src/libglusterfsclient.c libglusterfsclient/src/libglusterfsclient.h rpc/rpc-lib/src/auth-glusterfs.c rpc/rpc-lib/src/auth-null.c rpc/rpc-lib/src/auth-unix.c rpc/rpc-lib/src/protocol-common.h rpc/rpc-lib/src/rpc-clnt.c rpc/rpc-lib/src/rpc-clnt.h rpc/rpc-lib/src/rpc-transport.c rpc/rpc-lib/src/rpc-transport.h rpc/rpc-lib/src/rpcsvc-auth.c rpc/rpc-lib/src/rpcsvc-common.h rpc/rpc-lib/src/rpcsvc.c rpc/rpc-lib/src/rpcsvc.h rpc/rpc-lib/src/xdr-common.h rpc/rpc-lib/src/xdr-rpc.c rpc/rpc-lib/src/xdr-rpc.h rpc/rpc-lib/src/xdr-rpcclnt.c rpc/rpc-lib/src/xdr-rpcclnt.h rpc/rpc-transport/rdma/src/name.c rpc/rpc-transport/rdma/src/name.h rpc/rpc-transport/rdma/src/rdma.c rpc/rpc-transport/rdma/src/rdma.h rpc/rpc-transport/socket/src/name.c rpc/rpc-transport/socket/src/name.h rpc/rpc-transport/socket/src/socket.c rpc/rpc-transport/socket/src/socket.h xlators/cluster/afr/src/afr-common.c xlators/cluster/afr/src/afr-dir-read.c xlators/cluster/afr/src/afr-dir-read.h xlators/cluster/afr/src/afr-dir-write.c xlators/cluster/afr/src/afr-dir-write.h xlators/cluster/afr/src/afr-inode-read.c xlators/cluster/afr/src/afr-inode-read.h xlators/cluster/afr/src/afr-inode-write.c xlators/cluster/afr/src/afr-inode-write.h xlators/cluster/afr/src/afr-lk-common.c xlators/cluster/afr/src/afr-mem-types.h xlators/cluster/afr/src/afr-open.c xlators/cluster/afr/src/afr-self-heal-algorithm.c xlators/cluster/afr/src/afr-self-heal-algorithm.h xlators/cluster/afr/src/afr-self-heal-common.c xlators/cluster/afr/src/afr-self-heal-common.h xlators/cluster/afr/src/afr-self-heal-data.c xlators/cluster/afr/src/afr-self-heal-entry.c xlators/cluster/afr/src/afr-self-heal-metadata.c xlators/cluster/afr/src/afr-self-heal.h xlators/cluster/afr/src/afr-self-heald.c xlators/cluster/afr/src/afr-self-heald.h xlators/cluster/afr/src/afr-transaction.c xlators/cluster/afr/src/afr-transaction.h xlators/cluster/afr/src/afr.c xlators/cluster/afr/src/afr.h xlators/cluster/afr/src/pump.c xlators/cluster/afr/src/pump.h xlators/cluster/dht/src/dht-common.c xlators/cluster/dht/src/dht-common.h xlators/cluster/dht/src/dht-diskusage.c xlators/cluster/dht/src/dht-hashfn.c xlators/cluster/dht/src/dht-helper.c xlators/cluster/dht/src/dht-inode-read.c xlators/cluster/dht/src/dht-inode-write.c xlators/cluster/dht/src/dht-layout.c xlators/cluster/dht/src/dht-linkfile.c xlators/cluster/dht/src/dht-mem-types.h xlators/cluster/dht/src/dht-rebalance.c xlators/cluster/dht/src/dht-rename.c xlators/cluster/dht/src/dht-selfheal.c xlators/cluster/dht/src/dht.c xlators/cluster/dht/src/nufa.c xlators/cluster/dht/src/switch.c xlators/cluster/stripe/src/stripe-helpers.c xlators/cluster/stripe/src/stripe-mem-types.h xlators/cluster/stripe/src/stripe.c xlators/cluster/stripe/src/stripe.h xlators/features/index/src/index-mem-types.h ¹ xlators/features/index/src/index.c ¹ xlators/features/index/src/index.h ¹ xlators/performance/io-cache/src/io-cache.c xlators/performance/io-cache/src/io-cache.h xlators/performance/io-cache/src/ioc-inode.c xlators/performance/io-cache/src/ioc-mem-types.h xlators/performance/io-cache/src/page.c xlators/performance/io-threads/src/io-threads.c xlators/performance/io-threads/src/io-threads.h xlators/performance/io-threads/src/iot-mem-types.h xlators/performance/md-cache/src/md-cache-mem-types.h xlators/performance/md-cache/src/md-cache.c xlators/performance/quick-read/src/quick-read-mem-types.h xlators/performance/quick-read/src/quick-read.c xlators/performance/quick-read/src/quick-read.h xlators/performance/read-ahead/src/page.c xlators/performance/read-ahead/src/read-ahead-mem-types.h xlators/performance/read-ahead/src/read-ahead.c xlators/performance/read-ahead/src/read-ahead.h xlators/performance/symlink-cache/src/symlink-cache.c xlators/performance/write-behind/src/write-behind-mem-types.h xlators/performance/write-behind/src/write-behind.c xlators/protocol/auth/addr/src/addr.c ¹ xlators/protocol/auth/login/src/login.c ¹ xlators/protocol/client/src/client-callback.c xlators/protocol/client/src/client-handshake.c xlators/protocol/client/src/client-helpers.c xlators/protocol/client/src/client-lk.c xlators/protocol/client/src/client-mem-types.h xlators/protocol/client/src/client.c xlators/protocol/client/src/client.h xlators/protocol/client/src/client3_1-fops.c ¹ Copyright only, license reverted to original Change-Id: If560e826c61b6b26f8b9af7bed6e4bcbaeba31a8 BUG: 820551 Signed-off-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.com/3304 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>