summaryrefslogtreecommitdiffstats
path: root/xlators/cluster/dht
Commit message (Collapse)AuthorAgeFilesLines
...
* cluster/dht: Handle get cached/hashed subvol failures gracefullyshishir gowda2012-02-193-4/+47
| | | | | | | | | Change-Id: I7a41c2876be04acd166b2004d9aa66af078d32ea BUG: 790328 Signed-off-by: shishir gowda <shishirng@gluster.com> Reviewed-on: http://review.gluster.com/2757 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* cluster/dht: Exit clean when assert_no_child_down is enabledVijay Bellur2012-02-191-1/+1
| | | | | | | | | Change-Id: If90b1080edcf3792f8590492b585a6dd48fac18e BUG: 783249 Signed-off-by: Vijay Bellur <vijay@gluster.com> Reviewed-on: http://review.gluster.com/2664 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* cluster/distribute: send the proper 'loc_t' for statfs()Amar Tumballi2012-02-191-9/+11
| | | | | | | | | | | | in dht-diskusage.c, which was getting used for getting free disk space of all subvolumes Change-Id: Ieb2bb5f2479fac1803b9af4ef1948954a026c2ee Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 290282 Reviewed-on: http://review.gluster.com/2767 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/dht: Rebalance will be a new glusterfs processshishirng2012-02-195-17/+750
| | | | | | | | | | | | | | | | | | | | | | | | rebalance will not use any maintainance clients. It is replaced by syncops, with the volfile. Brickop (communication between glusterd<->glusterfs process) is used for status and stop commands. Dept-first traversal of dir is maintained, but data is migrated as and when encounterd. fix-layout (dir) do Complete migrate-data of dir fix-layout (subdir) done Rebalance state is saved in the vol file, for restart-ability. A disconnect event and pidfile state determine the defrag-status Signed-off-by: shishirng <shishirng@gluster.com> Change-Id: Iec6c80c84bbb2142d840242c28db3d5f5be94d01 BUG: 763844 Reviewed-on: http://review.gluster.com/2540 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* posix: handle some internal behavior in posix_mknod()Amar Tumballi2012-02-161-0/+5
| | | | | | | | | | | | | | | | | | | | assume a case of link() systemcall, which is handled in distribute by creating a 'linkfile' in hashed subvolume, if the 'oldloc' is present in different subvolume. we have same 'gfid' for the linkfile as that of file for consistency. Now, a file with multiple hardlinks, we may end up with 'hardlinked' linkfiles. dht create linkfile using 'mknod()' fop, and as now posix_mknod() is not equipped to handle this situation. this patch fixes the situation by looking at the 'internal' key set in the dictionary to differentiate the call which originates from inside with regular system calls. Change-Id: Ibff7c31f8e0c8bdae035c705c93a295f080ff985 BUG: 763844 Signed-off-by: Amar Tumballi <amar@gluster.com> Reviewed-on: http://review.gluster.com/2755 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* core: add an extra flag to readv()/writev() APIAmar Tumballi2012-02-144-12/+17
| | | | | | | | | | | | needed to implement a proper handling of open flag alterations using fcntl() on fd. Change-Id: Ic280d5db6f1dc0418d5c439abb8db1d3ac21ced0 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 782265 Reviewed-on: http://review.gluster.com/2723 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* support for nano second resolution for mtime,ctime,atime attributes.krishna2012-02-091-3/+15
| | | | | | | | | Change-Id: Id5078f270d0fec280b53d4aa7b16bbaf42a2df05 BUG: 784095 Signed-off-by: krishna <ksriniva@redhat.com> Reviewed-on: http://review.gluster.com/2730 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* distribute: utilize new 'fremovexattr()' fop for rebalanceAmar Tumballi2012-02-021-1/+1
| | | | | | | | | | | | | instead of existing 'syncop_removexattr()' which is not rename proof for now. Change-Id: Ib171710645a6ee35c86d851a057b68461ecbab27 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 766571 Reviewed-on: http://review.gluster.com/2691 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* complete the implementation of missing 'f**xattr()' fopsAmar Tumballi2012-01-252-0/+62
| | | | | | | | | | | | | | in debug/* and cluster/* translators and a syncop_fsetxattr() added a test case for testing the working of 'f-fop()' on fuse mount. Change-Id: I0c2aeeb30a0fb382ef2495cca1e66b00abaffd35 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 766571 Reviewed-on: http://review.gluster.com/802 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* core: add 'fremovexattr()' fopAmar Tumballi2012-01-253-0/+62
| | | | | | | | | | | so operations can be done on fd for extended attribute removal Change-Id: Ie026f1b53793aeb4ae33e96ea5408c7a97f34bf6 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 766571 Reviewed-on: http://review.gluster.com/778 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* cluster/dht: handle ENOENT err in rename fopshishir gowda2012-01-251-1/+2
| | | | | | | | | | | | | A ENOENT should not be a error propogated for rename failures. As, ENOENT can arise only due to internal unlink call of rename. Change-Id: I925622da8ef370d0385bc5b30cf8dc9b8e852beb BUG: 768879 Signed-off-by: shishir gowda <shishirng@gluster.com> Reviewed-on: http://review.gluster.com/2583 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* core: get xattrs also as part of readdirpAmar Tumballi2012-01-252-8/+10
| | | | | | | | | | | | | readdirp_req() call sends a dict_t * as an argument, which contains all the xattr keys for which the entries got in readdirp_rsp() are having xattr value filled dictionary. Change-Id: I8b7e1290740ea3e884e67d19156ce849227167c0 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 765785 Reviewed-on: http://review.gluster.com/771 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* Revert "distribute: utilize new 'fremovexattr()' fop for rebalance"Harshavardhana2012-01-231-1/+1
| | | | | | | | | | | This reverts commit c4e4be31ec2783b984e7dbb9ecbc1eea84044ad0 .. There is a problem here. syncop_fremovexattr doesn't exist in the codebase yet. Have a dependency merge for this. No other patches can be uploaded without a build failure now. Change-Id: Ic2c6194d905ffcfa7cbdc29c9bc8739f628d404e Reviewed-on: http://review.gluster.com/2679 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* distribute: utilize new 'fremovexattr()' fop for rebalanceAmar Tumballi2012-01-231-1/+1
| | | | | | | | | | | | | instead of existing 'syncop_removexattr()' which is not rename proof for now. Change-Id: I3b2afbe58ce90658100cc56b01e23bed672854e8 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 766571 Reviewed-on: http://review.gluster.com/803 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Shishir Gowda <shishirng@gluster.com>
* core: GFID filehandle based backend and anonymous FDsAnand Avati2012-01-203-1/+255
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. What -------- This change introduces an infrastructure change in the filesystem which lets filesystem operation address objects (inodes) just by its GFID. Thus far GFID has been a unique identifier of a user-visible inode. But in terms of addressability the only mechanism thus far has been the backend filesystem path, which could be derived from the GFID only if it was cached in the inode table along with the entire set of dentry ancestry leading up to the root. This change essentially decouples addressability from the namespace. It is no more necessary to be aware of the parent directory to address a file or directory. 2. Why ------- The biggest use case for such a feature is NFS for generating persistent filehandles. So far the technique for generating filehandles in NFS has been to encode path components so that the appropriate inode_t can be repopulated into the inode table by means of a recursive lookup of each component top-down. Another use case is the ability to perform more intelligent self-healing and rebalancing of inodes with hardlinks and also to detect renames. A derived feature from GFID filehandles is anonymous FDs. An anonymous FD is an internal USABLE "fd_t" which does not map to a user opened file descriptor or to an internal ->open()'d fd. The ability to address a file by the GFID eliminates the need to have a persistent ->open()'d fd for the purpose of avoiding the namespace. This improves NFS read/write performance significantly eliminating open/close calls and also fixes some of today's limitations (like keeping an FD open longer than necessary resulting in disk space leakage) 3. How ------- At each storage/posix translator level, every file is hardlinked inside a hidden .glusterfs directory (under the top level export) with the name as the ascii-encoded standard UUID format string. For reasons of performance and scalability there is a two-tier classification of those hardlinks under directories with the initial parts of the UUID string as the directory names. For directories (which cannot be hardlinked), the approach is to use a symlink which dereferences the parent GFID path along with basename of the directory. The parent GFID dereference will in turn be a dereference of the grandparent with the parent's basename, and so on recursively up to the root export. 4. Development --------------- 4a. To leverage the ability to address an inode by its GFID, the technique is to perform a "nameless lookup". This means, to populate a loc_t structure as: loc_t { pargfid: NULL parent: NULL name: NULL path: NULL gfid: GFID to be looked up [out parameter] inode: inode_new () result [in parameter] } and performing such lookup will return in its callback an inode_t populated with the right contexts and a struct iatt which can be used to perform an inode_link () on the inode (without a parent and basename). The inode will now be hashed and linked in the inode table and findable via inode_find(). A fundamental change moving forward is that the primary fields in a loc_t structure are now going to be (pargfid, name) and (gfid) depending on the kind of FOP. So far path had been the primary field for operations. The remaining fields only serve as hints/helpers. 4b. If read/write is to be performed on an inode_t, the approach so far has been to: fd_create(), STACK_WIND(open, fd), fd_bind (in callback) and then perform STACK_WIND(read, fd) etc. With anonymous fds now you can do fd_anonymous (inode), STACK_WIND (read, fd). This results in great boost in performance in the inbuilt NFS server. 5. Misc ------- The inode_ctx_put[2] has been renamed to inode_ctx_set[2] to be consistent with the rest of the codebase. Change-Id: Ie4629edf6bd32a595f4d7f01e90c0a01f16fb12f BUG: 781318 Reviewed-on: http://review.gluster.com/669 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* core/setxattr: prevent users from setting glusterfs xattrsRajesh Amaravathi2012-01-141-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | * Each xlator prevents the user from setting glusterfs-internal xattrs like trusted.gfid by handling it in respective setxattr functions. The speacial case of trusted.gfid is handled in fuse (Not in posix because posix_setxattr is used to set gfid). * For xlators which did not define setxattr and/or fsetxattr, the functions have been implemented with appropriate checks. xlator | fops-added _______________|__________________________ | 1. afr | fsetxattr 2. stripe | setxatrr and fsetxattr 3. quota | setxattr and fsetxattr Change-Id: Ib62abb7067415b23a708002f884d30e8866fbf48 BUG: 765487 Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com> Reviewed-on: http://review.gluster.com/685 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* cluster/distribute: dht_aggregate() fix a logic error before xattr comparisonsHarshavardhana2012-01-121-1/+1
| | | | | | | | | Change-Id: I20f015263bed9851225005d5f41a5d518bd22592 BUG: 769691 Signed-off-by: Harshavardhana <fharshav@redhat.com> Reviewed-on: http://review.gluster.com/2557 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* rebalance: remove the *.fix.layout xattr from trusted domain.Rajesh Amaravathi2011-12-281-1/+1
| | | | | | | | | | | moving trusted.distribute.fix.layout from trusted domain. Change-Id: If4317e4320998bbd739a8472b8814d75c425d341 BUG: 765487 Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com> Reviewed-on: http://review.gluster.com/747 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* cluster/distribute: Assert checks at known locationsHarshavardhana2011-12-062-10/+26
| | | | | | | | | | | and new function dict_get_ptr_and_len(). Change-Id: I653a1cc8123baa36d750250d02721aa98b196f38 BUG: 3158 Signed-off-by: Harshavardhana <fharshav@redhat.com> Reviewed-on: http://review.gluster.com/744 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/distribute: Add support for 'min-free-inodes' on each distribute ↵Harshavardhana2011-11-234-200/+257
| | | | | | | | | | | | | | | | | subvolume. This change is required as increasingly large number of small files would cause inodes to run out before they run out on available disk space. It is highly necessary to support algorithmic checking of inodes too just as we do for disk space. Change-Id: I9b87405328d443825e239ee80ab664aceb50ee68 BUG: 3799 Signed-off-by: Harshavardhana <fharshav@redhat.com> Reviewed-on: http://review.gluster.com/730 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* cluster/dht: set gfid in lookup locPranith Kumar K2011-11-231-0/+2
| | | | | | | | Change-Id: I59599cc88be1d973c955600fdd54e6c49c77b4a2 BUG: 3770 Reviewed-on: http://review.gluster.com/681 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* dht,afr: Fixed gfid problemsPranith Kumar K2011-11-181-0/+3
| | | | | | | | | | | *) removed uuid_generate usage in pump and afr, self-heald *) filled the gfids for the fops which were sending no gfid in loc Change-Id: I85da3c10f5ee2006248b0123155a60867870d202 BUG: 3760 Reviewed-on: http://review.gluster.com/679 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/dht: Fix a typoHarshavardhana2011-11-161-3/+3
| | | | | | | | | | Change-Id: I6bcdc7d600ebb9ef68c60319f96cd9e28d12c861 BUG: 3809 Signed-off-by: Harshavardhana <fharshav@redhat.com> Reviewed-on: http://review.gluster.com/732 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* glusterfs: An effort to fix all the spell mistakes and typoHarshavardhana2011-11-164-7/+7
| | | | | | | | | | | | | | | in the entire glusterfs codebase. This patch fixes many of spell mistakes and typo in the entire glusterfs codebase and all supported modules. Change-Id: I83238a41aa08118df3cf4d1d605505dd3cda35a1 BUG: 3809 Signed-off-by: Harshavardhana <fharshav@redhat.com> Reviewed-on: http://review.gluster.com/731 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* core: remove 'ino' variable from 'inode_t' structureAmar Tumballi2011-11-161-3/+2
| | | | | | | | Change-Id: I0f078d1753db65d2f2e0380d1b0450c114cf40dd BUG: 3518 Reviewed-on: http://review.gluster.com/522 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/distribute: Trigger selfheal only if rmdir succeeded onceshishir gowda2011-11-162-2/+7
| | | | | | | | | | | A EACCES error also should not trigger a selfheal. Only if rmdir succeeded on any subvol, a selfheal should be triggered Change-Id: Ifebb980f0ebd1548adfd3be975ca52ca44c3f3ab BUG: 3786 Reviewed-on: http://review.gluster.com/716 Reviewed-by: Amar Tumballi <amar@gluster.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/distribute: Use local call_cnt while windingshishir gowda2011-11-163-16/+20
| | | | | | | | | | | | layout->cnt might be modified in cbk's or different threads, which will lead to corruptions Change-Id: Icfdab01ac583cb3d27d62f878b79e0098b597952 BUG: 3730 Reviewed-on: http://review.gluster.com/694 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* cluster/distribute lookup: send revalidate calls to all subvols for directoriesshishir gowda2011-11-161-2/+13
| | | | | | | | | | | | | | If mkdir fails on a subvolume, layout is set taking into account only the subvols where it was successful. stat does not trigger selfheal, as its layout based. Revalidate on directories needs to be sent to all subvols, to fix the error, and not just on the layout. Change-Id: If17055508ffcf268806258ed49e7d7500a89d0f2 BUG: 3793 Reviewed-on: http://review.gluster.com/693 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* Move some of the messages to log level `TRACE'.Sachidananda Urs2011-11-151-2/+2
| | | | | | | | Change-Id: I46133b5e2218b9d810251b3dadadd8f157ab07d7 BUG: 3761 Reviewed-on: http://review.gluster.com/643 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* stripe readv_cbk: Fix stat return valuesshishir gowda2011-11-081-1/+0
| | | | | | | | | | | | | | | | | | | | | | | Workaround - If the read request, does not fall to the subvolume with the largest file size set, then we never return the correct size. This leads to clients seeing a truncated file error. The work around is to wipe stat being returned as part of read call. Problem - We were passing the stbuf returned by the first child/index, which can be different to the size/blocks returned by stat. This led to applications viewing the file as being truncated. The stbuf size needs to be the largest of all results, and blocks the aggregation from all subvolumes. (similar to stat) BUG: 3774 Change-Id: I46c53c18b2b42b1f5b86b05555bbab73bf993476 Reviewed-on: http://review.gluster.com/666 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* build: warning suppression (round n)Amar Tumballi2011-10-201-1/+14
| | | | | | | | | | with this patch, there are no more warnings with gcc (GCC) 4.6.1 20110908 Change-Id: Ice0d52d304b9846395f8a4a191c98eb53125f792 BUG: 2550 Reviewed-on: http://review.gluster.com/607 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* distribute: handle migration of symlink and special filesAmar Tumballi2011-10-203-58/+188
| | | | | | | | | | | | TODO: currently, wrt. rebalance/decommissioning, only pending thing is hardlink migration. Change-Id: I30cd06802e84c95601a5a081198f1f09c6d6bc01 BUG: 3714 Reviewed-on: http://review.gluster.com/578 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shishir Gowda <shishirng@gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/dht: Handle 'user.ufo-test' sent by ufo in dht setxattr.Junaid2011-10-061-2/+53
| | | | | | | | | | In this case, send setxattr to all the bricks to test user xattr support and return error if setxattr fails on even one of the bricks. Change-Id: I9a9b27312a650d25bbc1f57e1ab8899d63d0593d BUG: 3185 Reviewed-on: http://review.gluster.com/542 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* core: made changes to return value of __is_root_gfid()Amar Tumballi2011-10-021-1/+1
| | | | | | | | | | | now returns 'true(1)' is gfid is root, 'false(0)' if not. earlier it was the inverse, which was bit confusing Change-Id: Id103f444ace048cbb0fccdc72c6646da06631584 BUG: 3518 Reviewed-on: http://review.gluster.com/549 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* statedump: do not print the inode number in the statedumpRaghavendra Bhat2011-10-011-36/+20
| | | | | | | | | | | | | Since gfid is used to uniquely identify a inode, in the statedump printing inode number is not necessary. Its suffecient if the gfid of the inode is printed. And do not print the the inodelks, entrylks and posixlks if the lock count is 0. Change-Id: Idac115fbce3a5684a0f02f8f5f20b194df8fb27f BUG: 3476 Reviewed-on: http://review.gluster.com/530 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* distribute rebalance: preserve proper mode in destinationAmar Tumballi2011-09-301-3/+11
| | | | | | | | | | | * don't set 'sticky' and 'sgid' bits to 0 without checking if the source had those bits prior to rebalance. Change-Id: Ia826cb3cfb55312cdbf00d3421f2bd06b3103ce6 BUG: 3656 Reviewed-on: http://review.gluster.com/541 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* rebalance process: propagate proper errors to userAmar Tumballi2011-09-301-8/+45
| | | | | | | | | | | | | | | | * cluster/distribute: while rebalance, differentiate between valid errors, validation check failures (may not be migrate failure), and success. * mgmt/glusterd: differentiate the errors based on errno. If a valid error has happened, mark the rebalance status as failure, and stop the rebalnace crawl. Change-Id: I2d7bd7b73d2b79bfaf892ad4364bc89830a0d5bb BUG: 3656 Reviewed-on: http://review.gluster.com/540 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* Second round of warning suppression.Jeff Darcy2011-09-292-4/+0
| | | | | | | | | | | | | | | | | | | | | | Used a #pragma to kill ~170 in rpcgen code. Added GF_UNUSED to deal with a few more from macros elsewhere. The remainder are function return values (mostly context and dict calls) that really should be checked. Those would be harder to fix without real understanding of the code where they occur, so they remain as reminders. (Patchset 2: deal with older gcc that doesn't handle #pragma GCC diagnostic) (Patchset 3: fix include paths in generated files) (Patchset 4: keep up with trunk, squash 9 new warnings) (Patchset 5: six more, all in AFR) Change-Id: I29760c8c81be4d7e6489312c5d0e92cc24814b7b BUG: 2550 Reviewed-on: http://review.gluster.com/378 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/distribute: fixed a spurious inode refAmar Tumballi2011-09-271-4/+12
| | | | | | | | | | | | | | While bringing in support to open-fd migration, dht_local_init() itself started doing 'loc_copy()'. I had left one case in dht_lookup() where there is a extra loc_copy() on existing copied 'local->loc', which would cause 2 inode_refs on a given inode, and only one inode_unref() happens in dht_local_wipe(). Change-Id: Idd0375bdf9a6408db1e97e80389249ef8d802adb BUG: 3590 Reviewed-on: http://review.gluster.com/504 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shishir Gowda <shishirng@gluster.com>
* cluster/distribute: validate buf before accessing.Rahul C S2011-09-221-5/+5
| | | | | | | | | | | The macro to check & reset rebalance flags was accessing the iatt structure even in case of failures leading to null dereference. Change-Id: I518f4cc9086cecbe6cf791c8a351287fe3613650 BUG: 3594 Reviewed-on: http://review.gluster.com/472 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shishir Gowda <shishirng@gluster.com>
* cluster/distribute: minor fixes in open file migrationAmar Tumballi2011-09-195-37/+91
| | | | | | | | | | | * incorporated Avati's comments on the first patch. * send proper stat information while unwinding Change-Id: I36982cec610753c241c372272620ab2bd581fd9f BUG: 3071 Reviewed-on: http://review.gluster.com/408 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* Fix typo in log message.Sachidananda Urs2011-09-191-1/+1
| | | | | | | | Change-Id: Ia51ffe03c8b94ddfe21c6609bc0d54b5bd29eca7 BUG: 3158 Reviewed-on: http://review.gluster.com/392 Reviewed-by: Vijay Bellur <vijay@gluster.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* support for de-commissioning a node using 'remove-brick'Amar Tumballi2011-09-135-6/+141
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to achieve this, we now create volume-file with 'decommissioned-nodes' option in distribute volume, then just perform the rebalance set of operations (with 'force' flag set). now onwards, the 'remove-brick' (with 'start' option) operation tries to migrate data from removed bricks to existing bricks. 'remove-brick' also supports similar options as of replace-brick. * (no options) -> works as 'force', will have the current behavior of remove-brick, ie., no data-migration, volume changes. * start (starts remove-brick with data-migration/draining process, which takes care of migrating data and once complete, will commit the changes to volume file) * pause (stop data migration, but keep the volume file intact with extra options whatever is set) * abort (stop data-migration, and fall back to old configuration) * commit (if volume is stopped, commits the changes to volumefile) * force (stops the data-migration and commits the changes to volume file) Change-Id: I3952bcfbe604a0952e68b6accace7014d5e401d3 BUG: 1952 Reviewed-on: http://review.gluster.com/118 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* distribute rebalance: handle the open file migrationAmar Tumballi2011-09-1213-1651/+2898
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Complexity involved: To migrate a file with open fd, we have to notify the other client process which has the open fd, and make sure the write()s happening on that fd is properly synced to the migrated file. Once the migration is complete, the client process which has open-fd should get notified and it should start performing all the operations on the new subvolume, instead of earlier cached volume. How to solve the notification part: We can overload the 'postbuf' attribute in the _cbk() function to understand if a file is 'under-migration' or 'migration-complete' state. (This will be something similar to deciding whether a file is DHT-linkfile by its 'mode'). Overall change includes below mentioned major changes: 1. dht_linkfile is decided by only 2 factors (mode(01000), xattr(trusted.glusterfs.dht.linkto)), instead of earlier 3 factors (size==0) 2. in linkfile self-heal part (in 'dht_lookup_everywhere_cbk()'), don't delete a linkfile if there is a open-fd on it. It means, there may be a migration in progress. 3. if a file's revalidate fails with ENOENT, it may be due to file migration, and hence need a lookup_everywhere() 4. There will be 2 phases of file-migration. -> Phase 1: Migration in progress * The source data file will have SGID and STICKY bit set in its mode. * The source data file will have a 'linkto' xattr pointing the destination. * Destination file will have mode set to '01000', and 'linkto' xattr set to itself. -> Phase 2: File migration Complete * The source data file will have mode '01000', and will be 'truncated' to size 0. * The destination file will have inherited mode from the source. (without sgid and sticky bit) and its 'linkto' attribute will be removed. 4. Changes in distribute to work smoothly with a file which is in migration / got migrated. The 'fops' are divided into 3 categories, inode-read, inode-write and others. inode-read fops need to handle only 'phase 2' notification, where as, the inode-write fops need to handle both 'phase 1' and phase2. The inode-write operations will be done on source file, and if any of 'file-migration' procedures are detected in _cbk(), then the operations should be performed on the destination too. when a phase-2 is detected, then the inode-ctx itself should be changed to represent a new layout. With these changes, the open file migration will work smoothly with multiple clients. Change-Id: I512408463814e650f34c62ed009bf2101d016fd6 BUG: 3071 Reviewed-on: http://review.gluster.com/209 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* Eliminate many "var set but not used" warnings with newer gcc.Jeff Darcy2011-09-074-26/+0
| | | | | | | | | | | | | | | | This fixes ~200 such warnings, but leaves three categories untouched. (1) Rpcgen code. (2) Macros which set variables in the outer (calling function) scope. (3) Variables which are set via function calls which may have side effects. Change-Id: I6554555f78ed26134251504b038da7e94adacbcd BUG: 2550 Reviewed-on: http://review.gluster.com/371 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* cluster/distribute: unwind the proper dict in getxattr_cbkAmar Tumballi2011-08-231-1/+4
| | | | | | | | | | without which, quota would get confused about the total size Change-Id: I0fb822ee67e3c1585f783ae35292fe71c47ee249 BUG: 3421 Reviewed-on: http://review.gluster.com/304 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* xlator options: revamp xlator option validation/reconfigure codeAnand Avati2011-08-181-143/+22
| | | | | | | | | | | | | | | | | - move option handling to options.c (new file) - remove duplication of option validation code - remove duplication of gf_log / sprintf - get rid of xlator_t->validate_options - get rid of option validation in rpc-transport - get rid of validate_options() in every xlator - use xlator_volume_option_get to clean up many functions - introduce primitives to init/reconfigure option types Change-Id: I51798af72c8dc0a2b9e017424036eb3667dfc7ff BUG: 3415 Reviewed-on: http://review.gluster.com/235 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* allocate extra bytes (for \0) when calling reallocVenky Shankar2011-08-171-1/+2
| | | | | | | | | | | | We use strcat to concat pathinfo strings. strcat appends a \0 at the end. Therefore allocate extra bytes for the same to avoid asserts being hit during gf_free call Change-Id: I582f5858e7375a5bacfc5c702c612ee09c3bb355 BUG: 3413 Reviewed-on: http://review.gluster.com/249 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* adjust allocated length to memory overrun fixVenky Shankar2011-08-151-3/+4
| | | | | | | | | | | size of the allocated length is incorrectly calculated which could cause memory overrun while filling the buffer Change-Id: I4fbdbd1fff937ca15bae9f634ef5757dda52caa8 BUG: 3413 Reviewed-on: http://review.gluster.com/236 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* Change Copyright current yearPranith Kumar K2011-08-1013-13/+13
| | | | | | | | Change-Id: I2d10f2be44f518f496427f257988f1858e888084 BUG: 3348 Reviewed-on: http://review.gluster.com/200 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>