path: root/xlators/cluster/afr
Commit message | Author | Age | Files | Lines
* core: adding extra data for fops | Amar Tumballi | 2012-03-22 | 20 files | -589/+696
    With this change, the xlator APIs will have a dictionary as an extra
    argument, which is passed between all the layers. This can be utilized
    for overloading in some of the operations.

    Change-Id: I58a8186b3ef647650280e63f3e5e9b9de7827b40
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
    BUG: 782265
    Reviewed-on: http://review.gluster.com/2960
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Disabled self-heal on clear-locks internal mount | Krishnan Parthasarathi | 2012-03-21 | 1 file | -5/+5
    - Also, changed afr_get_xattr_clrlk_cbk to use dict_set_dynstr for
      clear-lock summary. Earlier, it was relying on 'str' passed from
      xlators below.

    Change-Id: I175f4542e6ef2c859c4811eecb9d8c5a7d25a283
    BUG: 800779
    Signed-off-by: Krishnan Parthasarathi <kp@gluster.com>
    Reviewed-on: http://review.gluster.com/2992
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* replicate: fix a glitch in up_count/down_count updates. | Jeff Darcy | 2012-03-19 | 1 file | -2/+24
    Change-Id: I4919a98191bf7fe5edad9a149a129bcd177cd4a8
    BUG: 802522
    Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-on: http://review.gluster.com/2927
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* Logs: Improved logs in lock/unlock execution path | Pranith Kumar K | 2012-03-18 | 1 file | -1/+1
    Statedump will now start showing the lk-owner of the stack.

    Change-Id: I9f650ce9a8b528cd626c8bb595c1bd1050462c86
    BUG: 803209
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/2968
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* Self-heald: Handle errors gracefully and show errors to users | Pranith Kumar K | 2012-03-18 | 5 files | -106/+196
    Change-Id: I5424ebfadb5b2773ee6f7370cc2867a555aa48dd
    BUG: 800352
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/2962
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: set_read_child when xactions in progress in fresh lookup | Pranith Kumar K | 2012-03-18 | 2 files | -3/+7
    Change-Id: I33e0268635ae7a1f247b0052994e027f990083da
    BUG: 800755
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/2963
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* afr: Copy loc->gfid independent of lookup being fresh or otherwise | Krishnan Parthasarathi | 2012-03-18 | 1 file | -5/+3
    This change ensures that entry self-heal following a lookup on that
    entry would have loc->gfid 'filled'.

    Change-Id: If723c71ca43e1f062dcb99cbe5488342514dace0
    BUG: 786087
    Signed-off-by: Krishnan Parthasarathi <kp@gluster.com>
    Reviewed-on: http://review.gluster.com/2950
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Enable eager-lock | Pranith Kumar K | 2012-03-17 | 7 files | -104/+146
    Eager-lock is disabled by default. Use cluster.eager-lock on/off to
    change the config. write-behind on and eager-lock off is not a
    supported configuration.

    In afr, when eager-lock is enabled the inode lock on fd is taken using
    the fd address as the lk-owner. So the lock is interchangeable between
    the inode-locks on the same fd.

    Change-Id: I7eef1ecd510f8028f5395dee882782da53c0de3f
    BUG: 802515
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/2925
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* afr: Corrected getxattr 'key' matching in case of clrlk cmd | Krishnan Parthasarathi | 2012-03-14 | 2 files | -10/+5
    - Added local->dict cleanup into afr_local_cleanup

    Change-Id: Ie1b96615735a9d2a2be1757cd016dbe225aae31c
    BUG: 800412
    Signed-off-by: Krishnan Parthasarathi <kp@gluster.com>
    Reviewed-on: http://review.gluster.com/2922
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: handle sending NULL dentry name for inode link in self-heal-daemon | Raghavendra Bhat | 2012-03-14 | 2 files | -2/+11
    * Without the dentry name, dentry cannot be created in inode_link,
      which leads to trying to access the null dentry to check if it is
      cyclic and thus to a segfault. So send the parent inode also as NULL,
      which just returns the proper inode after assigning the gfid and type
      to the inode without trying to create a dentry.
    * Handle failures such as dentry_create returning NULL in inode_link
      properly and return NULL in such cases.
    * Increase the lru limit of inode table of self-heal-daemon to 2048

    Change-Id: I7ae0e0e9be279d1694b6aafb5e054585e43f03ff
    BUG: 801149
    Signed-off-by: Raghavendra Bhat <raghavendrabhat@gluster.com>
    Reviewed-on: http://review.gluster.com/2893
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: save the xattr obtained in the {f}xattrop_cbk in local | Raghavendra Bhat | 2012-03-12 | 2 files | -5/+27
    If the {f}xattrop operation succeeds on one of the subvolumes and
    fails on another (thus the xattr dict obtained from the failed
    subvolume in the callback will be NULL), then afr would unwind with
    op_ret = 0 (since the operation was successful on one subvolume), but
    the xattr dict would be NULL (afr does not save the xattr it has
    received in the callback in its local structure and sends the xattr it
    received in the last callback). xlators above afr might segfault when
    they access the xattr, since they would have assumed that xattr would
    be present as op_ret is 0.

    Change-Id: I50761a302150285f31dfdaa397f890c9370a989a
    BUG: 797119
    Signed-off-by: Raghavendra Bhat <raghavendrabhat@gluster.com>
    Reviewed-on: http://review.gluster.com/2813
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
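The failure mode described in this commit can be modelled roughly as follows. This is a Python sketch of the idea only, not afr code: `Local`, `xattrop_cbk` and the `example-key` entry are hypothetical names chosen for illustration. It shows a per-fop structure that keeps the dict from a successful reply, so the final unwind does not depend on which subvolume happens to answer last.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Local:
    """Hypothetical stand-in for afr's per-fop local structure."""
    call_count: int                # replies still outstanding
    op_ret: int = -1               # turns 0 once any subvolume succeeds
    xattr: Optional[dict] = None   # dict saved from a successful reply


def xattrop_cbk(local, op_ret, xattr):
    """Process one subvolume reply; return (op_ret, xattr) on the last one."""
    if op_ret == 0:
        local.op_ret = 0
        if local.xattr is None:
            local.xattr = xattr    # save it: a failed reply carries None
    local.call_count -= 1
    if local.call_count == 0:
        # Unwind with the saved dict rather than this callback's argument.
        return local.op_ret, local.xattr
    return None


local = Local(call_count=2)
first = xattrop_cbk(local, 0, {"example-key": b"\x00"})
last = xattrop_cbk(local, -1, None)   # the failing subvolume replies last
print(last)
```

Without the saved copy, the last call would pair op_ret = 0 with a None dict, which is the combination the commit message says callers above afr do not expect.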
* cluster/afr: handle node failures in lookup | Pranith Kumar K | 2012-03-05 | 2 files | -2/+29
    When a transaction is in progress lookup depends on inode ctx for
    read-child. If the lookup fails on the read-child while another
    transaction is in progress, it should select the read-child as the
    next success_child which is in fresh_children.

    Change-Id: I33a04b102966b63a64bacf8d2e29f0d0119fdac6
    BUG: 773225
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/2858
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
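The selection rule in this commit message can be sketched as below. It is a Python illustration under assumptions: children are represented by integer indices, -1 means "no usable child", and `pick_read_child` is a hypothetical name rather than an afr function.

```python
def pick_read_child(read_child, success_children, fresh_children):
    """Keep read_child if lookup succeeded on it; otherwise fall back to
    the first successful child that is also fresh, or -1 if none is."""
    if read_child in success_children:
        return read_child
    for child in success_children:
        if child in fresh_children:
            return child
    return -1


# Lookup failed on child 0; children 1 and 2 answered, but only 2 is fresh.
print(pick_read_child(0, [1, 2], [0, 2]))
```

The order of `success_children` decides which fresh child wins when several qualify; the commit message does not say more than "the next success_child", so that ordering is left to the caller here.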
* cluster/afr: Reset re-usable sh args in sh_*_done | Pranith Kumar K | 2012-03-05 | 4 files | -27/+28
    The bug is observed due to a stale value of the active_sink count set
    in metadata self-heal.

    Change-Id: I41996999213c323c0f4d4db575d87b2d0b4b3fec
    BUG: 798874
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/2849
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* fops/removexattr: prevent users from removing glusterfs xattrs | Rajesh Amaravathi | 2012-03-05 | 2 files | -10/+37
    * Each xlator prevents the user from removing xlator-specific xattrs
      like trusted.gfid by handling it in respective removexattr functions.
    * For xlators which did not define removexattr and fremovexattr, the
      functions have been implemented with appropriate checks.

      xlator         | fops-added
      _______________|__________________________
                     |
      1. stripe      | removexattr and fremovexattr
      2. quota       | removexattr and fremovexattr

    Change-Id: I98e22109717978134378bc75b2eca83fefb2abba
    BUG: 783525
    Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com>
    Reviewed-on: http://review.gluster.com/2836
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* Revert "afr: [Un]Set the 'right' lkowner for [f]{inode|entry}_lk and the 'enclosed' fop." | Vijay Bellur | 2012-03-03 | 6 files | -94/+22
    This reverts commit 2e80fdbeb6abbb23ff6789c2b98c82704883af0a.

    Change-Id: I417fd43e4195d63e5b8b83dd3beb712887130e1e
    Reviewed-on: http://review.gluster.com/2860
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: copy the parent's gfid from child loc while building parent loc | Raghavendra Bhat | 2012-03-02 | 1 file | -0/+2
    Suppose the process is not a fuse or nfs mounted client but some other
    process such as rebalance. Then after lookups the inode would not be
    linked to the inode table (since the inode was created for rebalance
    purposes only), thus keeping the inode's gfid NULL. And afr, while
    building the parent loc using the child loc, does not copy the pargfid
    present in the child's loc structure. protocol/client will search for
    the gfid either in loc or in loc->inode and assert if it cannot find
    the gfid in either of them.

    Change-Id: I882e449fb8b79d5c69e4a942abcd844dc4d5d30c
    BUG: 799262
    Signed-off-by: Raghavendra Bhat <raghavendrabhat@gluster.com>
    Reviewed-on: http://review.gluster.com/2857
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Pranith Kumar Karampuri <pranithk@gluster.com>
* cluster/afr: Add new option to know which process it is in | Pranith Kumar K | 2012-03-01 | 4 files | -5/+11
    The afr xlator needs to maintain an inode table inside the xlator if
    it is in the self-heal-daemon. The code was depending on the option
    self-heal-daemon to do this. This is wrong as the option can be
    reconfigured to on/off. Added a new option which can't be reconfigured
    for this purpose.

    Change-Id: Idc42c403c4bd9b73d1f328427ae4158ff1420b3a
    BUG: 795741
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/2787
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Handle errors in build_parent_loc | Pranith Kumar K | 2012-03-01 | 5 files | -50/+81
    BUG: 787671
    Change-Id: I0b01b0f9e14a26d757748413dd71909e915c7573
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/2826
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* afr: [Un]Set the 'right' lkowner for [f]{inode|entry}_lk and the 'enclosed' fop. | Krishnan Parthasarathi | 2012-03-01 | 6 files | -22/+94
    afr 'mangles' the lkowner in order to ensure [f]inodelk/[f]entrylk
    fops from the same application contend. But other fops that are
    'visible' to the application should operate with the lkowner provided
    by fuse for correct functioning of posix-locks xlator.

    Change-Id: I7e71f35ae7df2a070f1f46d4fc77eed26a717673
    BUG: 790743
    Signed-off-by: Krishnan Parthasarathi <kp@gluster.com>
    Reviewed-on: http://review.gluster.com/2752
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Hardlink Self-heal | Pranith Kumar K | 2012-03-01 | 1 file | -11/+172
    Change-Id: Iea0b38011edbc7fc9d75a95d1775139a5234e514
    BUG: 765391
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/2841
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* Introduce new extended attribute: node-uuid | Venky Shankar | 2012-02-22 | 2 files | -36/+105
    A request for trusted.glusterfs.node-uuid returns a pathinfo-like
    string, but containing the UUID of glusterd instead of the backend
    path for the requested file. This info is beneficial for tasks like
    parallel rebalance that will make use of the UUID for data locality.

    Change-Id: I766a09cc4a5f63aebd11c73107924a1b29242dcf
    BUG: 772610
    Signed-off-by: Venky Shankar <vshankar@redhat.com>
    Reviewed-on: http://review.gluster.com/2614
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Shishir Gowda <shishirng@gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* mempool: adjustments in pool sizes | Amar Tumballi | 2012-02-22 | 2 files | -2/+2
    * while creating 'rpc_clnt', the caller knows what would be the ideal
      load on it, so an extra argument to set some pool sizes
    * while creating 'rpcsvc', the caller knows what would be the ideal
      load on it, so an extra argument to set request pool size
    * cli memory footprint is reduced

    Change-Id: Ie245216525b450e3373ef55b654b4cd30741347f
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
    BUG: 765336
    Reviewed-on: http://review.gluster.com/2784
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Don't trust the fd returned in open_cbk | Pranith Kumar K | 2012-02-22 | 1 file | -4/+7
    Change-Id: Id7d85a38875e3675904fc134e54e723c6a0c4de2
    BUG: 786766
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/2792
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* pump: Fixed undefined reference to fill_loc_info fn. | Krishnan Parthasarathi | 2012-02-22 | 2 files | -4/+8
    Renamed the function to pump_fill_loc_info since its use is relevant
    only in pump's context.

    Change-Id: I5be1a908f88328f732dacfd7eac18f0c62f49eb8
    BUG: 796066
    Signed-off-by: Krishnan Parthasarathi <kp@gluster.com>
    Reviewed-on: http://review.gluster.com/2796
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Pranith Kumar Karampuri <pranithk@gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* core: utilize mempool for frame->local allocations | Amar Tumballi | 2012-02-21 | 13 files | -73/+98
    In each translator which uses 'frame->local', we are using
    GF_CALLOC/GF_FREE, which would be costly considering the number of
    allocations happening in the lifetime of a 'fop'.

    It would be good to utilize the mem pool framework for xlator's local
    structures, so there is no allocation overhead.

    Change-Id: Ida6e65039a24d9c219b380aa1c3559f36046dc94
    Signed-off-by: Amar Tumballi <amar@gluster.com>
    BUG: 765336
    Reviewed-on: http://review.gluster.com/2772
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Add commands to see self-heald ops | Pranith Kumar K | 2012-02-20 | 8 files | -258/+690
    Change-Id: Id92d3276e65a6c0fe61ab328b58b3954ae116c74
    BUG: 763820
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/2775
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Self-heald, Index integration | Pranith Kumar K | 2012-02-20 | 10 files | -204/+467
    Change-Id: Ic68eb00b356a6ee3cb88fe2bde50374be7a64ba3
    BUG: 763820
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/2749
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* locks: Added a getxattr interface to clear locks on a given inode. | Krishnan Parthasarathi | 2012-02-14 | 2 files | -30/+172
    getxattr returns a summary of no. of inodelks/entrylks cleared.

    cmd_structure: trusted.glusterfs.clrlk.t<type>.k<kind>[.{range|basename}]
    where,
      type  = "inode" | "entry" | "posix"
      kind  = "granted" | "blocked" | "all"
      range = off,a-b, where a, b = 'start', 'len' from offset 'off'

    Change-Id: I8a771530531030a9d4268643bc6823786ccb51f2
    BUG: 789858
    Signed-off-by: Krishnan Parthasarathi <kp@gluster.com>
    Reviewed-on: http://review.gluster.com/2551
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
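The cmd_structure above can be assembled mechanically. The following Python sketch builds such a key from its parts; `clrlk_key` and `clrlk_range` are hypothetical helper names, and the validation simply mirrors the type and kind values listed in the commit message.

```python
PREFIX = "trusted.glusterfs.clrlk"
TYPES = ("inode", "entry", "posix")
KINDS = ("granted", "blocked", "all")


def clrlk_range(off, start, length):
    """Format the optional range suffix as 'off,a-b'."""
    return f"{off},{start}-{length}"


def clrlk_key(lock_type, kind, arg=None):
    """Build trusted.glusterfs.clrlk.t<type>.k<kind>[.{range|basename}]."""
    if lock_type not in TYPES:
        raise ValueError(f"unknown lock type: {lock_type!r}")
    if kind not in KINDS:
        raise ValueError(f"unknown lock kind: {kind!r}")
    key = f"{PREFIX}.t{lock_type}.k{kind}"
    if arg is not None:
        key += f".{arg}"          # a range string or an entry basename
    return key


print(clrlk_key("inode", "all"))
print(clrlk_key("posix", "granted", clrlk_range(0, 0, 10)))
```

The resulting string is what would be passed as the xattr name in a getxattr call on the file whose locks are to be cleared; issuing that call is outside the scope of this sketch.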
* core: add an extra flag to readv()/writev() API | Amar Tumballi | 2012-02-14 | 7 files | -12/+18
    Needed to implement proper handling of open flag alterations using
    fcntl() on fd.

    Change-Id: Ic280d5db6f1dc0418d5c439abb8db1d3ac21ced0
    Signed-off-by: Amar Tumballi <amar@gluster.com>
    BUG: 782265
    Reviewed-on: http://review.gluster.com/2723
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Perform xattrop with all afr-keys | Pranith Kumar K | 2012-01-27 | 8 files | -369/+358
    Self-heal does not happen if the file has change log xattr only for
    one of the subvol keys. This patch makes sure that xattrop is done for
    all the afr subvol keys after a new entry is created in
    entry-self-heal.

    1) Added matrix create/cleanup functions
    2) Impunging a new file does multiple xattrops on the source subvol,
       one per sink. The code can do a single xattrop after the entry is
       created on all the sinks.
    3) Missing entry self-heal uses one frame per sink to heal the file.
       This leads to multiple xattrops on the source subvol. That code is
       changed now to use one frame which will create the file on all
       subvols.

    Change-Id: I65a42f9779b03f7efae283479f8653fb2cb8046b
    BUG: 762680
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/2503
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-by: Krishnan Parthasarathi <kp@gluster.com>
* cluster/afr: Stack wind with correct frame | Pranith Kumar K | 2012-01-27 | 1 file | -10/+15
    *) Found possible races in _cbk, fixed them as well.

    Change-Id: Id9a9f3cbf71f55827addb24ba2cbddecb8326b5b
    BUG: 784279
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/2687
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* complete the implementation of missing 'f**xattr()' fops | Amar Tumballi | 2012-01-25 | 6 files | -6/+143
    in debug/* and cluster/* translators and a syncop_fsetxattr()

    added a test case for testing the working of 'f-fop()' on fuse mount.

    Change-Id: I0c2aeeb30a0fb382ef2495cca1e66b00abaffd35
    Signed-off-by: Amar Tumballi <amar@gluster.com>
    BUG: 766571
    Reviewed-on: http://review.gluster.com/802
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@gluster.com>
* core: add 'fremovexattr()' fop | Amar Tumballi | 2012-01-25 | 3 files | -0/+185
    so operations can be done on fd for extended attribute removal

    Change-Id: Ie026f1b53793aeb4ae33e96ea5408c7a97f34bf6
    Signed-off-by: Amar Tumballi <amar@gluster.com>
    BUG: 766571
    Reviewed-on: http://review.gluster.com/778
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@gluster.com>
* core: get xattrs also as part of readdirp | Amar Tumballi | 2012-01-25 | 7 files | -20/+25
    The readdirp_req() call sends a dict_t * as an argument, which
    contains all the xattr keys for which the entries returned in
    readdirp_rsp() carry a dictionary filled with xattr values.

    Change-Id: I8b7e1290740ea3e884e67d19156ce849227167c0
    Signed-off-by: Amar Tumballi <amar@gluster.com>
    BUG: 765785
    Reviewed-on: http://review.gluster.com/771
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@gluster.com>
* core: change lk-owner as a 1k buffer | Amar Tumballi | 2012-01-24 | 3 files | -28/+33
    so NLM can send the lk-owner field directly to the locks translators.
    As part of the same effort, also enabled sending a maximum of 500 aux
    gids over protocol.

    Change-Id: I87c2514392748416f7ffe21d5154faad2e413969
    Signed-off-by: Amar Tumballi <amar@gluster.com>
    BUG: 767229
    Reviewed-on: http://review.gluster.com/779
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@gluster.com>
* cluster/afr: set loc->gfid for building root loc [v3.3.0qa20] | Shylesh Kumar | 2012-01-23 | 1 file | -1/+1
    Change-Id: Icb902846d243df0502f664bfd187280cecd4397c
    BUG: 784176
    Signed-off-by: Shylesh Kumar <shylesh@gluster.com>
    Reviewed-on: http://review.gluster.com/2681
    Reviewed-by: Pranith Kumar Karampuri <pranithk@gluster.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* pump: move internal pump xattrs out of trusted domain | Rajesh Amaravathi | 2012-01-23 | 2 files | -14/+9
    * the trusted.glusterfs.pump.{start|pause|commit|status|abort} xattrs
      have been moved out of trusted domain. This enables separation of
      xattrs used as gluster-internal commands (handled by pump) for
      replace-brick, which are not set in the back-end, from xattrs set on
      the replace-brick source and destination bricks.
    * macro definitions from pump.h and glusterd.h, #defining these xattrs,
      have been merged and put into libglusterfs/src/glusterfs.h

    Change-Id: I87b8bfbf045aa140f5d3f0c9baa9b2e79f87b67b
    BUG: 783049
    Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com>
    Reviewed-on: http://review.gluster.com/2663
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amar@gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* core: GFID filehandle based backend and anonymous FDs | Anand Avati | 2012-01-20 | 6 files | -73/+126
    1. What
    --------
    This change introduces an infrastructure change in the filesystem
    which lets filesystem operations address objects (inodes) just by
    their GFID. Thus far GFID has been a unique identifier of a
    user-visible inode. But in terms of addressability the only mechanism
    thus far has been the backend filesystem path, which could be derived
    from the GFID only if it was cached in the inode table along with the
    entire set of dentry ancestry leading up to the root.

    This change essentially decouples addressability from the namespace.
    It is no longer necessary to be aware of the parent directory to
    address a file or directory.

    2. Why
    -------
    The biggest use case for such a feature is NFS for generating
    persistent filehandles. So far the technique for generating
    filehandles in NFS has been to encode path components so that the
    appropriate inode_t can be repopulated into the inode table by means
    of a recursive lookup of each component top-down.

    Another use case is the ability to perform more intelligent
    self-healing and rebalancing of inodes with hardlinks and also to
    detect renames.

    A derived feature from GFID filehandles is anonymous FDs. An anonymous
    FD is an internal USABLE "fd_t" which does not map to a user opened
    file descriptor or to an internal ->open()'d fd. The ability to
    address a file by the GFID eliminates the need to have a persistent
    ->open()'d fd for the purpose of avoiding the namespace. This improves
    NFS read/write performance significantly, eliminating open/close
    calls, and also fixes some of today's limitations (like keeping an FD
    open longer than necessary, resulting in disk space leakage).

    3. How
    -------
    At each storage/posix translator level, every file is hardlinked
    inside a hidden .glusterfs directory (under the top level export) with
    the name as the ascii-encoded standard UUID format string. For reasons
    of performance and scalability there is a two-tier classification of
    those hardlinks under directories with the initial parts of the UUID
    string as the directory names.

    For directories (which cannot be hardlinked), the approach is to use a
    symlink which dereferences the parent GFID path along with basename of
    the directory. The parent GFID dereference will in turn be a
    dereference of the grandparent with the parent's basename, and so on
    recursively up to the root export.

    4. Development
    ---------------
    4a. To leverage the ability to address an inode by its GFID, the
    technique is to perform a "nameless lookup". This means, to populate a
    loc_t structure as:

        loc_t {
            pargfid: NULL
            parent: NULL
            name: NULL
            path: NULL
            gfid: GFID to be looked up [out parameter]
            inode: inode_new () result [in parameter]
        }

    and performing such a lookup will return in its callback an inode_t
    populated with the right contexts and a struct iatt which can be used
    to perform an inode_link () on the inode (without a parent and
    basename). The inode will now be hashed and linked in the inode table
    and findable via inode_find().

    A fundamental change moving forward is that the primary fields in a
    loc_t structure are now going to be (pargfid, name) and (gfid)
    depending on the kind of FOP. So far path had been the primary field
    for operations. The remaining fields only serve as hints/helpers.

    4b. If read/write is to be performed on an inode_t, the approach so
    far has been to: fd_create(), STACK_WIND(open, fd), fd_bind (in
    callback) and then perform STACK_WIND(read, fd) etc. With anonymous
    fds now you can do fd_anonymous (inode), STACK_WIND (read, fd). This
    results in a great boost in performance in the inbuilt NFS server.

    5. Misc
    -------
    The inode_ctx_put[2] has been renamed to inode_ctx_set[2] to be
    consistent with the rest of the codebase.

    Change-Id: Ie4629edf6bd32a595f4d7f01e90c0a01f16fb12f
    BUG: 781318
    Reviewed-on: http://review.gluster.com/669
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@gluster.com>
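The two-tier layout described under "How" can be illustrated with a small path computation. This Python sketch rests on one loud assumption: the commit message only says the tiers use "the initial parts of the UUID string", so the width of two characters per tier is a guess exposed as the `width` parameter. `gfid_handle_path` and the `/export/brick1` root are hypothetical names.

```python
import uuid
from pathlib import PurePosixPath


def gfid_handle_path(export_root, gfid, width=2):
    """Return the hardlink location for a GFID under .glusterfs.

    width is the number of leading UUID characters used per directory
    tier; the value 2 is an assumption, not taken from the commit text.
    """
    name = str(gfid)                       # standard 8-4-4-4-12 UUID string
    tier1 = name[:width]
    tier2 = name[width:2 * width]
    return PurePosixPath(export_root) / ".glusterfs" / tier1 / tier2 / name


gfid = uuid.UUID("12345678-1234-5678-1234-567812345678")
print(gfid_handle_path("/export/brick1", gfid))
```

Because the path depends only on the GFID and the export root, it can be computed without knowing the file's parent directory, which is the decoupling the commit message describes.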
* cluster/afr: do not unlock without holding the lock on the fd | Raghavendra Bhat | 2012-01-19 | 1 file | -1/+1
    In afr_open_fd_fix we were unlocking local->fd->lock without holding
    the lock on it if we were not able to get the fd context. Now we go
    directly to out and return, instead of going to unlock without holding
    the lock.

    Change-Id: I0da638bbd2c269127cf111b3aac707e4a95d20c6
    BUG: 783036
    Signed-off-by: Raghavendra Bhat <raghavendrabhat@gluster.com>
    Reviewed-on: http://review.gluster.com/2658
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
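The control-flow fix described here (leave through the "out" path when the lock was never taken) can be sketched in Python. This is an analogy, not a translation of afr_open_fd_fix: the `Fd` class, the `fixed` flag and the return values are hypothetical, and Python's `threading.Lock` stands in for the fd lock.

```python
import threading


class Fd:
    """Hypothetical stand-in for an fd: a lock plus an optional context."""

    def __init__(self, ctx=None):
        self.lock = threading.Lock()
        self.ctx = ctx


def open_fd_fix(fd):
    """Return True if the fd context was found and updated under the lock."""
    if fd.ctx is None:
        # 'out' path: the lock was never acquired, so it must not be
        # released. Releasing an unlocked threading.Lock raises RuntimeError.
        return False
    fd.lock.acquire()
    try:
        fd.ctx["fixed"] = True
    finally:
        fd.lock.release()      # 'unlock' path, reached only while locked
    return True


print(open_fd_fix(Fd()), open_fd_fix(Fd(ctx={})))
```

Keeping the early return ahead of the acquire means every release is paired with exactly one acquire, which is the invariant the one-line fix restores.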
* core/setxattr: prevent users from setting glusterfs xattrs | Rajesh Amaravathi | 2012-01-14 | 6 files | -18/+233
    * Each xlator prevents the user from setting glusterfs-internal xattrs
      like trusted.gfid by handling it in respective setxattr functions.
      The special case of trusted.gfid is handled in fuse (not in posix,
      because posix_setxattr is used to set gfid).
    * For xlators which did not define setxattr and/or fsetxattr, the
      functions have been implemented with appropriate checks.

      xlator         | fops-added
      _______________|__________________________
                     |
      1. afr         | fsetxattr
      2. stripe      | setxattr and fsetxattr
      3. quota       | setxattr and fsetxattr

    Change-Id: Ib62abb7067415b23a708002f884d30e8866fbf48
    BUG: 765487
    Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com>
    Reviewed-on: http://review.gluster.com/685
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amar@gluster.com>
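A guard of the kind this commit describes might look like the following Python sketch. Everything beyond trusted.gfid is an assumption made for illustration: the `trusted.afr.*` pattern, the choice of EPERM as the error, and the name `check_setxattr` are not taken from the commit.

```python
import errno
from fnmatch import fnmatch

# Illustrative patterns only; each real xlator guards its own key names.
INTERNAL_XATTR_PATTERNS = ("trusted.gfid", "trusted.afr.*")


def check_setxattr(keys):
    """Return 0 if a user may set every key, else errno.EPERM."""
    for key in keys:
        if any(fnmatch(key, pattern) for pattern in INTERNAL_XATTR_PATTERNS):
            return errno.EPERM
    return 0


print(check_setxattr(["user.comment"]))
print(check_setxattr(["user.comment", "trusted.gfid"]))
```

Rejecting the whole request when any single key is internal keeps the check simple; whether the real fops reject or silently drop such keys is not stated in the commit message.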
* cluster/afr: Remove dead code | Pranith Kumar K | 2012-01-10 | 3 files | -55/+0
    Change-Id: I239128c51b728fbb7814fd6a41020b76c88fbd93
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    BUG: 772876
    Reviewed-on: http://review.gluster.com/2623
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Handle fini for afr,pump [v3.3.0qa19] | Pranith Kumar K | 2011-12-28 | 5 files | -63/+89
    Change-Id: Idc0a05a8a25f278a7ab05e242263e0a5001bde18
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    BUG: 767862
    Reviewed-on: http://review.gluster.com/800
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: EIO should overwrite ENOENT in lookup | Pranith Kumar K | 2011-12-28 | 1 file | -2/+3
    If lookup sees a gfid mismatch along with some ENOENTs, and self-heal
    can't remove the stale entry, self-heal tells lookup to unwind with
    EIO. But since ENOENT has higher priority, the errno is not
    overwritten. This patch fixes that case.

    Change-Id: Icd68c4a5cf05dd97c568964ab647a34fdb6e26f4
    BUG: 765528
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/2541
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
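The precedence change in this commit can be expressed as a small decision function. This Python sketch is a simplified model under assumptions: `lookup_errno`, the `self_heal_failed` flag and the fallback to the last reported errno are hypothetical and only illustrate that EIO from a failed self-heal is no longer masked by ENOENT.

```python
import errno


def lookup_errno(child_errnos, self_heal_failed):
    """Choose the errno that lookup unwinds with."""
    if self_heal_failed:
        return errno.EIO               # must win even if ENOENT was seen
    if errno.ENOENT in child_errnos:
        return errno.ENOENT            # otherwise ENOENT keeps its priority
    return child_errnos[-1] if child_errnos else 0


print(lookup_errno([errno.ENOENT, 0], self_heal_failed=True) == errno.EIO)
```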
* cluster/afr: Handle error cases in local init | Pranith Kumar K | 2011-12-28 | 9 files | -516/+384
    - Fop should unwind with appropriate errno
    - Local is de-allocated on errors

    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Change-Id: I4db40342ae184fe1cc29e51072e8fea72ef2cb15
    BUG: 770513
    Reviewed-on: http://review.gluster.com/2539
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Handle split-brain/all-fool xattrs for directory | Pranith Kumar K | 2011-12-27 | 6 files | -147/+152
    In case of split-brain/all-fool xattrs, perform a conservative merge.
    Don't treat an ignorant subvol as a fool.

    Change-Id: I6ddf89949cd5793c2abbead7c47f091e8461f1d4
    BUG: 765528
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/2521
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Set pargfid when missing [v3.3.0qa18] | Pranith Kumar K | 2011-12-22 | 3 files | -4/+14
    The client asserts on a missing pargfid in case of unlink, so afr
    needs to make sure it is present in that fop.

    Change-Id: Iea0ad65e1e7254c8df412942c52d5870e853aa51
    BUG: 769055
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/2495
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Fix meta data lock range | Pranith Kumar K | 2011-12-22 | 1 file | -1/+1
    Change-Id: I7615f31309c6c8f5373e1ff0535d84396dfa1455
    BUG: 765430
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/807
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Double the call count if transaction is for rename | Pranith Kumar K | 2011-12-13 | 1 file | -4/+18
    In rename the changelog modification needs to happen both on old
    parent-dir and new parent-dir, so 2 stack winds are done per brick.

    Change-Id: I43f34661e397c4288162213944529e18b7724b1d
    BUG: 766603
    Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
    Reviewed-on: http://review.gluster.com/783
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
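The arithmetic behind this commit is simple enough to state as code. In this Python sketch `changelog_call_count` and the `child_up` list are hypothetical names; the only fact taken from the commit message is that a rename winds twice per brick (old and new parent directory).

```python
def changelog_call_count(child_up, is_rename):
    """Number of callbacks to wait for before the changelog step is done.

    child_up holds one truthy/falsy flag per brick; each brick that is up
    gets one wind, or two for rename (old and new parent directory).
    """
    up = sum(1 for state in child_up if state)
    return 2 * up if is_rename else up


print(changelog_call_count([True, True, False], is_rename=True))
```

If the count were left at one per brick for a rename, the transaction would consider itself finished after half of the replies had arrived.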
* Add command-line support (but no doc) for enforce-quorum option. | Jeff Darcy | 2011-11-28 | 6 files | -69/+118
    Change-Id: Ia52ddb551e24c27969f7f5fa0f94c1044789731f
    BUG: 3823
    Reviewed-on: http://review.gluster.com/743
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Update read-child if it becomes stale | Pranith Kumar K | 2011-11-28 | 3 files | -36/+30
    Change-Id: I00c714a89575023f6dbdd3430dcbf191e5d08019
    BUG: 3650
    Reviewed-on: http://review.gluster.com/740
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>