glusterfs.git -

	Commit message	Author	Age	Files	Lines
*	distribute rebalance: handle the open file migration	Amar Tumballi	2011-09-12	15	-1654/+2906
	Complexity involved:

	To migrate a file with an open fd, we have to notify the other client process which has the open fd, and make sure the write()s happening on that fd are properly synced to the migrated file. Once the migration is complete, the client process which has the open fd should be notified, and it should start performing all operations on the new subvolume instead of the earlier cached volume.

	How to solve the notification part:

	We can overload the 'postbuf' attribute in the _cbk() function to tell whether a file is in the 'under-migration' or the 'migration-complete' state. (This is similar to deciding whether a file is a DHT linkfile by its 'mode'.)

	The overall change includes the following major changes:

	1. A dht_linkfile is now decided by only 2 factors (mode(01000), xattr(trusted.glusterfs.dht.linkto)), instead of the earlier 3 factors (which also included size==0).
	2. In the linkfile self-heal part (in 'dht_lookup_everywhere_cbk()'), don't delete a linkfile if there is an open fd on it; that may mean a migration is in progress.
	3. If a file's revalidate fails with ENOENT, it may be due to file migration, and hence needs a lookup_everywhere().
	4. There will be 2 phases of file migration.
	   -> Phase 1: Migration in progress
	      * The source data file will have the SGID and STICKY bits set in its mode.
	      * The source data file will have a 'linkto' xattr pointing to the destination.
	      * The destination file will have its mode set to '01000', and its 'linkto' xattr set to itself.
	   -> Phase 2: File migration complete
	      * The source data file will have mode '01000', and will be truncated to size 0.
	      * The destination file will have inherited its mode from the source (without the SGID and STICKY bits), and its 'linkto' attribute will be removed.
	5. Changes in distribute to work smoothly with a file which is in migration / got migrated.

	The 'fops' are divided into 3 categories: inode-read, inode-write and others. inode-read fops need to handle only the 'phase 2' notification, whereas the inode-write fops need to handle both 'phase 1' and 'phase 2'. The inode-write operations will be done on the source file, and if any of the 'file-migration' procedures is detected in _cbk(), then the operations should be performed on the destination too. When phase 2 is detected, the inode-ctx itself should be changed to represent the new layout.

	With these changes, open file migration will work smoothly with multiple clients.

	Change-Id: I512408463814e650f34c62ed009bf2101d016fd6
	BUG: 3071
	Reviewed-on: http://review.gluster.com/209
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
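	The mode/xattr rules in the message above can be illustrated with a small sketch. This is not the DHT code (which is C); the function name classify() and the returned labels are hypothetical, and the sketch only encodes the rules as stated in the commit message: a linkfile is mode 01000 plus the 'linkto' xattr, and a phase 1 source file has SGID and STICKY set plus the 'linkto' xattr.

```python
import stat

# xattr name taken from the commit message
LINKTO_XATTR = "trusted.glusterfs.dht.linkto"

# mode 01000 is the sticky bit with no permission bits set
LINKFILE_MODE = stat.S_ISVTX


def classify(mode, xattrs):
    """Classify a file from its st_mode and its xattr dict (hypothetical helper).

    Returns "linkfile", "under-migration" or "regular".
    """
    perm = stat.S_IMODE(mode)  # permission bits plus setuid/setgid/sticky
    has_linkto = LINKTO_XATTR in xattrs

    # Rule 1: linkfile is decided by mode 01000 and the linkto xattr only
    # (size is no longer considered).
    if has_linkto and perm == LINKFILE_MODE:
        return "linkfile"

    # Phase 1 source file: SGID and STICKY both set, linkto xattr present.
    migration_bits = stat.S_ISGID | stat.S_ISVTX
    if has_linkto and (perm & migration_bits) == migration_bits:
        return "under-migration"

    return "regular"
```

	Under these rules the phase 1 destination file and the phase 2 source file (assuming it keeps its 'linkto' xattr) are both reported as linkfiles, and a phase 2 destination file is a regular file again.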
*	geo-rep: partial support for unprivileged gsyncd via mountbroker	Csaba Henk	2011-09-12	5	-46/+197
	gsyncd:
	- mounting code is split into a direct and a mountbroker-based backend
	- option gluster-command is gone
	- new options: gluster-params, gluster-cli-options, mountbroker
	- the mountbroker mount backend is used if either a mountbroker label is given through the mountbroker option, or if gsyncd is unprivileged; in this case the username is used as the label
	- have gluster cli invocations log to stderr so that we don't hit a permission issue with the logfiles

	glusterd:
	- do gsyncd pre-config with the new options
	- add option geo-replication-log-group; if it is specified, the geo-rep logfile directories are given to that group (and thus members of the given group can log there)

	This is just WIP, as geo-rep relies on trusted extended attributes and those are not accessible to unprivileged users. Even if we solved this issue, glusterd security settings are too coarse, so if we made it possible for an unprivileged gsyncd to operate, we would open up too far.

	Change-Id: Icd520b58cbadccea3fad7c0f437b99de1e22db14
	BUG: 2825
	Reviewed-on: http://review.gluster.com/399
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	glusterd / cli: mount-broker service	Csaba Henk	2011-09-12	7	-3/+1027
	Mountbroker is configured in the glusterd volfile through a DSL which is restricted enough to appear as the value of a volfile knob. Basically, the DSL describes set-theoretic requirements against the option set which is sent by the cli (in the hope of getting a mount with these options). If the requirements are met, and the volume id and the uid who is to "own" the mount can be unambiguously deduced from the given request, glusterd does the mount with the given parameters.

	The use case of geo-replication is sugared by means of volume options which then generate a complete mount-broker option set.

	Demo:
	- add the following options to your glusterd volfile:
	    option mountbroker-root /tmp/mbr
	    option mountbroker.fool EQL(volfile-id=pop|user-map-root=|volfile-server=localhost)&MEET(user-map-root=john|user-map-root=jane)
	- before starting glusterd, create /tmp/mbr owned by root with mode 0755
	- with cli, do
	    $ gluster system:: mount fool volfile-id=pop33 user-map-root=jane volfile-server=localhost
	- on successful completion (volume pop33 exists and is started, jane is a valid username), the mount path will be echoed to you
	- you can get rid of the mount by
	    $ gluster system:: umount <mount-path>

	Change-Id: I629cf64add0a45500d05becc3316f67cdb5b42ff
	BUG: 3482
	Reviewed-on: http://review.gluster.com/128
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	add --user-map-root option	Csaba Henk	2011-09-12	2	-0/+15
	This makes the client pretend that the given user is a superuser, by changing FUSE requests that come with the uid of that user so that the uid is set to 0. The user can be given in numeric form, in which case it is treated as a uid directly; otherwise an attempt is made to resolve it to a uid with getpwnam(3). Implies --acl.

	Change-Id: I2d5a3d3e178be7ffdf22b46a56f33a7eeaaa7fe1
	BUG: 3242
	Reviewed-on: http://review.gluster.com/127
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
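	The uid handling described above can be sketched as follows. This is an illustration in Python, not the client code; resolve_uid() and map_request_uid() are hypothetical names, and only the behaviour stated in the message (numeric form used directly, otherwise getpwnam lookup, matching uid rewritten to 0) is modelled.

```python
import pwd


def resolve_uid(user):
    """Turn the option value into a uid (hypothetical helper).

    A numeric string is used as the uid directly; anything else is
    looked up in the password database, like getpwnam(3).
    """
    if user.isdecimal():
        return int(user)
    # Raises KeyError if the user name is unknown.
    return pwd.getpwnam(user).pw_uid


def map_request_uid(request_uid, mapped_uid):
    """Rewrite the uid of a request: the mapped user is seen as uid 0."""
    return 0 if request_uid == mapped_uid else request_uid
```

	For example, with the option value "1000", a request arriving with uid 1000 would be forwarded with uid 0, while requests from any other uid are left unchanged.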
*	gsyncd: python3 compat fixes	Csaba Henk	2011-09-12	4	-5/+36
	Also add a __codecheck script which can verify whether the source is OK at the syntactic level with a given Python interpreter.

	Change-Id: Ieff34bcd3efd1cdc0e8f9a510c05488f35897bbe
	BUG: 1570
	Reviewed-on: http://review.gluster.com/320
	Reviewed-by: Kaushik BV <kaushikbv@gluster.com>
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	glusterd: fix cleaning up of runner object	Csaba Henk	2011-09-12	1	-2/+1
	Without this, if the geo-rep component is not installed, glusterd was left with a zombie child.

	Change-Id: Ic4a2a4ffc943de68dd02db76a32b1618821ddf56
	BUG: 2744
	Reviewed-on: http://review.gluster.com/317
	Reviewed-by: Kaushik BV <kaushikbv@gluster.com>
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	glusterd: free the allocated string to avoid memory leak	Raghavendra Bhat	2011-09-12	2	-56/+24
	Change-Id: I520abf3c57a15be8bb7dd1e92ad0b049ef5c8970
	BUG: 3341
	Reviewed-on: http://review.gluster.com/394
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	protocol/client: avoid code duplication in fd based operations	Raghavendra Bhat	2011-09-11	2	-340/+42
	Change-Id: I012f78bac8ba82333628c59ef51d5e5f43d05ac7
	BUG: 3158
	Reviewed-on: http://review.gluster.com/329
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Amar Tumballi <amar@gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	features/marker: unref the local incase of errors before unwinding	Raghavendra Bhat	2011-09-11	1	-3/+5
	Change-Id: I4dcad7ddf84bf98b4b7f4a0e407a418426674280
	BUG: 2784
	Reviewed-on: http://review.gluster.com/299
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	mgmt/glusterd: volume set help-xml format change	Vijay Bellur	2011-09-11	1	-1/+1
	Change-Id: I503364c855d52605e301f4d3c205af6c9fc0e1df
	BUG: 3366
	Reviewed-on: http://review.gluster.com/380
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Amar Tumballi <amar@gluster.com>
*	features/marker-quota: Prefix the function names with mq (marker-quota).	Junaid	2011-09-09	5	-310/+310
	This fixes the bug where the marker translator and the quota translator cannot co-exist in the same process.

	Change-Id: I9f132b663f03641f4f2c7e168df8400adbc5570f
	BUG: 3020
	Reviewed-on: http://review.gluster.com/381
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Amar Tumballi <amar@gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	cluster/afr: perform self-heal with least priority	Pranith Kumar K	2011-09-09	1	-0/+7
	Change-Id: Id8a1dffa3c3200234ad154d1749278a2d7c7021b
	BUG: 3502
	Reviewed-on: http://review.gluster.com/336
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Anand Avati <avati@gluster.com>
*	glusterd rebalance: make co-operate with all other 'op'	Amar Tumballi	2011-09-09	4	-56/+380
	That way, we can share the rebalance state with other peers and prevent confusion/conflicts when multiple rebalances are started by different peers.

	Change-Id: I24159e69332644718df7314f6f1da7fce9ff740e
	BUG: 2112
	Reviewed-on: http://review.gluster.com/343
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	features/marker-quota: Perform xattr related operations with root permissions in rename fop.	Junaid	2011-09-09	2	-6/+39
	Change-Id: Id9ac1ecdd9753377c9eb24464f51dcbdc0cd2821
	BUG: 3194
	Reviewed-on: http://review.gluster.com/367
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	performance/io-threads: treat -ve pid as request for fop with least priority	Pranith Kumar K	2011-09-08	1	-63/+325
	Change-Id: Ib6730a708f008054fbd379889a0f6dd3b051b6ad
	BUG: 3502
	Reviewed-on: http://review.gluster.com/335
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Anand Avati <avati@gluster.com>
*	cluster/afr: Make data selfheal trigger to be configurable.	Pranith Kumar K	2011-09-08	11	-112/+217
	By default, lookup triggers data self-heal, but that is not the preferred way of operating replicated volumes. We would like data self-heals to be triggered in open instead. The number of background self-heals allowed is 16, and lookups block until self-heal is completed. We want to prevent blocking in fops. We cannot make lookups independent of self-heal frames, because when there are gfid conflicts the decision of which file is correct is made in the self-heal phase. So in afr, lookup self-heal is going to guarantee namespace consistency, and open/fd fops will take responsibility for data consistency; these are non-blocking. The user needs to set the option cluster.data-self-heal to "open" for this behavior.

	Change-Id: If9463cdb9ebac114708558ec13bbca0270acd659
	BUG: 3503
	Reviewed-on: http://review.gluster.com/334
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Anand Avati <avati@gluster.com>
*	posix-acl: configurable super user ID	Anand Avati	2011-09-08	2	-7/+61
	In configurations with a uid mapper, the super user ID could be mapped to a non-zero value. Hence it needs to be configurable in access control for proper super-user semantics.

	Change-Id: I51e8e0395680e9b96a99657a0af547659bd9affe
	BUG: 2815
	Reviewed-on: http://review.gluster.com/332
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Anand Avati <avati@gluster.com>
*	cluster/afr: eager locking of FD writes	Anand Avati	2011-09-08	4	-58/+182
	This patch is a change in the way write transactions hold a lock, which optimizes the case of sequential writes from a single writer.

	The lock phase of a transaction has two sub-phases. The first is an attempt to acquire locks in parallel by broadcasting non-blocking lock requests. If lock acquisition fails on any server, then the held locks are unlocked and the transaction reverts to a blocking lock mode, taken sequentially on one server after another.

	The change in this patch is to make the initial broadcast lock request attempt to acquire a lock on the entire file. If this fails, we revert to the sequential "regional" blocking lock as before.

	In the case where such an "eager" lock is granted in the non-blocking phase, it gives rise to an opportunity for optimization: if the next write transaction on the same FD arrives before the unlock phase of the first transaction, it "takes over" the full file lock. Similarly, if yet another transaction arrives before the unlock phase of the "optimized" transaction, that in turn "takes over" the lock as well. The actual unlock now happens at the end of the last "optimized" transaction.

	Any operation which arrives before the unlock phase of the previous transaction is a potential candidate to become an "optimized" transaction. In cases where the previous transaction had acquired its lock as a "regional" blocking lock, and the next transaction comes in before its unlock phase, it would not be an "optimized" transaction.

	Implied assumption
	------------------

	Since two or more transactions can now operate within the same large lock, there is a possibility that overlapping transactions arrive in opposite orders on the servers. However, in the larger picture this is not possible, as write-behind already ensures that no two overlapping writes on an inode are in transit at the same time. Overlapping writes across clients are not a problem, as they compete at the locks anyway.

	Theoretical benefits and potential harms
	----------------------------------------

	In case of a single writer:

	The benefits are large for sequential writes. In the best case the entire file write can happen with just one lock and unlock per server, provided writes are coming in fast enough and getting pipelined by write-behind soon enough (which is usually the case). If the writes are not coming in fast enough, then the optimization "kicks in" only for those subsets of writes which are close enough to get "piggybacked". For random writes the benefits are the same as well. In any case, the overall performance is better than or equal to the performance without this optimization for a single writer.

	In case of multiple writers:

	When multiple writers are not writing concurrently, there is no negative performance impact. When multiple writers are writing concurrently to the same region, there is no negative impact either, as they were previously getting arbitrated at the locks translator too. In the case of multiple writers writing to different regions concurrently, there will be an increased number of "failovers" from failed parallel non-blocking locks to sequential blocking regional locks.

	This "worst case" has a simple workaround: as soon as we detect an open-fd-count > 1 in the lookup xattr, we can disable this optimization on those fds.

	Beneficial side-effects
	-----------------------

	There is another similar optimization in AFR for changelogs, which goes by the name of "changelog-piggybacking". It works in a similar way: pending flags get 'taken over' or 'piggybacked' by the next transaction if its 'pre-op' phase kicks in before the 'post-op' phase of the previous transaction. It has been observed that this changelog-piggybacking optimization saves about 55% of the xattr calls hitting the wire, measured across various types of network interfaces. The side effect of this eager-lock optimization is that it gives an almost 100% saving of xattr calls, by making the optimistic changelog work much more efficiently, as it gives a wider overlap of the xattr phases of two consecutive transactions.

	Change-Id: I41c02eb3b64c14c68ef66a344610ec3f024cd59d
	BUG: 3409
	Reviewed-on: http://review.gluster.com/240
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Anand Avati <avati@gluster.com>
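	The "take over" rule above can be made concrete with a toy model. This is only a sketch, not AFR code: count_lock_cycles() is a hypothetical name, each transaction is reduced to a pair (arrival time, start of its unlock phase), and it is assumed that every transaction that has to lock is granted the eager full-file lock.

```python
def count_lock_cycles(transactions):
    """Count lock/unlock cycles per server for writes on one fd.

    `transactions` is a list of (arrival, unlock_start) pairs sorted by
    arrival time. A transaction that arrives before the unlock phase of
    the current lock holder takes over the lock instead of locking again.
    """
    cycles = 0
    holder_unlock = None  # unlock-phase start of the current lock holder

    for arrival, unlock_start in transactions:
        takes_over = holder_unlock is not None and arrival < holder_unlock
        if not takes_over:
            cycles += 1  # a fresh lock (and, later, an unlock) is needed
        # Either way, this transaction is now the one that will unlock.
        holder_unlock = unlock_start

    return cycles
```

	With closely spaced writes such as [(0, 5), (3, 8), (7, 12)] the model needs a single lock/unlock cycle, which is the "best case" described in the message; widely spaced writes such as [(0, 5), (6, 9)] fall back to one cycle per write.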
*	storage/posix: posix getxattr log enhancement	Rajesh Amaravathi	2011-09-08	1	-4/+4
	Now the key is logged with getxattr failure.

	Change-Id: I96a9234cf138ae0922dc403e2fddcd4df0d89df8
	BUG: 3283
	Reviewed-on: http://review.gluster.com/373
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Anand Avati <avati@gluster.com>
*	mount/nfs: Gluster nfs crashes with subdirectory mount	Rajesh Amaravathi	2011-09-08	1	-1/+4
	Glusterfs used to crash trying to dereference a NULL pointer. Also, in mnt3_resolve_export_subdir, the volume name was prefixed to the exported subdirectory, causing the subdirectory mount to fail. Fixed both issues.

	Change-Id: I746f0c244b4cbf03033d73ac3e40518762d76385
	BUG: 3481
	Reviewed-on: http://review.gluster.com/323
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Anand Avati <avati@gluster.com>
*	Save the mode flags set by the application when ACLs are in use	Pavan T C	2011-09-08	1	-1/+2
	While inheriting the ACLs from a directory that has default ACLs, make sure that the mode flags set by the application are saved. It is required to inherit only the Read, Write and Execute permissions while leaving the others, viz. setuid, setgid and sticky bit, untouched, hence honouring the requests made by the application during create operations (mknod, mkdir et al).

	For a description of the problem, root cause and evaluation, refer to: http://bugs.gluster.com/show_bug.cgi?id=3522

	Change-Id: I994077fb321a35d8254f0cc5a7de99a17ec40c47
	BUG: 3522
	Reviewed-on: http://review.gluster.com/368
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Anand Avati <avati@gluster.com>
*	gsyncd: do the homework, document _everything_	Csaba Henk	2011-09-08	9	-17/+483
	Change-Id: I559e6a0709b8064cfd54c693e289c741f9c4c4ab
	BUG: 1570
	Reviewed-on: http://review.gluster.com/319
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Kaushik BV <kaushikbv@gluster.com>
*	nfs3: Resolve entry vs. hash conflict at same dir depth	Shehjar Tikoo	2011-09-07	2	-15/+46
	Intro Note
	==========
	The current code in hard fh resolution takes the first-match approach, i.e. whichever dirent either matches the hash or matches the gfid first is the one chosen as the result for the next step of fh resolution.

	In the latter case, i.e. the dirent matches the gfid, the next step is to conclude the fh resolution by returning the entry whose gfid matched.

	In the former, i.e. the hash matches the dirent, we choose the hash-matching dirent as the next directory to descend into, for searching the file to be operated upon.

	Problem
	=======
	When performing hard fh resolution, there can be a situation where:

	o the hash of the primary entry, i.e. the entry we're looking for, and the hash of another sibling directory match. Note the use of "sibling", meaning both the primary entry and the hash-matching one are in the same directory, i.e. their filehandle.hashcount will be the same.

	o the sibling directory is encountered first during the dir search.

	Because of the current code described in "Intro", we'll end up descending into the sibling directory, even though the correct behaviour is to ignore it and wait till we encounter the primary entry in the same parent directory.

	Once we end up descending into this sibling directory, the directory depth validation check fails. The check fails because it notices that the resolution is attempting to open a directory that is deeper in the fs tree than the file we're looking for. When this check fails, we return an ESTALE. So basically, a false positive results in an ESTALE to Specsfs.

	This is not a theoretical situation. Avati and I saw this in a specsfs test where sfs created a terabytes-sized file system for its tests. The number of files in a single directory was so huge that the hashes of two entries ended up colliding.

	Change-Id: I4a6df11d326a67a507b1cd716c2c8e00b5a858a4
	BUG: 3510
	Reviewed-on: http://review.gluster.com/357
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Shehjar Tikoo <shehjart@gluster.com>
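	The difference between first-match and the behaviour described as correct can be sketched in a few lines. This is an illustration only, not the nfs3 patch: the Dirent type, its fields and resolve_step() are hypothetical, and the sketch simply assumes that a gfid match anywhere in the directory should win over a hash match seen earlier.

```python
from dataclasses import dataclass


@dataclass
class Dirent:
    # Hypothetical, simplified directory entry
    name: str
    gfid: str
    name_hash: int
    is_dir: bool


def resolve_step(dirents, target_gfid, next_hash):
    """One step of fh resolution over the entries of a single directory.

    Returns ("found", entry) if an entry with the target gfid exists,
    ("descend", entry) if only a hash-matching directory exists, and
    ("missing", None) otherwise.
    """
    hash_candidate = None
    for entry in dirents:
        if entry.gfid == target_gfid:
            # The primary entry wins, even if a hash-matching sibling
            # directory was seen earlier in the scan.
            return ("found", entry)
        if entry.is_dir and entry.name_hash == next_hash and hash_candidate is None:
            hash_candidate = entry  # remember it, but keep scanning
    if hash_candidate is not None:
        return ("descend", hash_candidate)
    return ("missing", None)
```

	With a sibling directory and the sought file sharing the same hash, and the sibling listed first, this returns the file rather than descending into the sibling, so the depth validation check is never reached.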
*	Eliminate many "var set but not used" warnings with newer gcc.	Jeff Darcy	2011-09-07	41	-396/+0
	This fixes ~200 such warnings, but leaves three categories untouched.
	(1) Rpcgen code.
	(2) Macros which set variables in the outer (calling function) scope.
	(3) Variables which are set via function calls which may have side effects.

	Change-Id: I6554555f78ed26134251504b038da7e94adacbcd
	BUG: 2550
	Reviewed-on: http://review.gluster.com/371
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Anand Avati <avati@gluster.com>
*	glusterd: send the 'stripe-count' value to peer during handshake	Amar Tumballi	2011-09-07	1	-0/+17
	Without this, if a peer is added after a volume of type 'stripe-replica' is created, it won't be reflected on the newly added peer.

	Change-Id: I77ee6aa3f33994bd4c6dbfefd853cc7e7491c1db
	BUG: 3523
	Reviewed-on: http://review.gluster.com/369
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Anand Avati <avati@gluster.com>
*	glusterd: fail the 'peer probe' if the first connect attempt fails	Amar Tumballi	2011-09-07	1	-4/+48
	This way the 'gluster peer probe' command doesn't hang until the timeout (120s); instead it sends the proper error message to the client.

	Change-Id: I398fa16d526f869f1d27eeb57aeb7ee4451fbecd
	BUG: 1852
	Reviewed-on: http://review.gluster.com/342
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Anand Avati <avati@gluster.com>
*	modify the way we use XDR definition files (.x files)	Amar Tumballi	2011-09-07	24	-466/+310
	Earlier:
	step 1: copy the existing <xdr>.x files to /tmp
	step 2: generate '.[ch]' files using 'rpcgen <xdr>.x'
	step 3: check the diff against the existing files, and add only your part of the changes back to the original file (ignore other changes).
	step 4: there is another file for the wrapper functions that convert structures to/from XDR buffers; update it with your new structure.
	step 5: use these wrapper functions in the newly written procedures.
	step 6: commit :-|

	Now:
	step 1: update (mostly adding only) the <xdr>.x file
	step 2: run the '<path-to-src>/extras/generate-xdr-files.sh <xdr>.x' command
	step 3: implement the rpc procedure to handle the request/response.
	step 4: commit :-)

	Change-Id: I219f9159fc980438c86e847c6b030be96e595ea2
	BUG: 3488
	Reviewed-on: http://review.gluster.com/341
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Anand Avati <avati@gluster.com>
*	glusterfs protocol: bring in variable sized iobuf support	Amar Tumballi	2011-09-07	9	-198/+297
	This is a step towards reducing the glusterfs memory footprint. It should also help a bit with overall performance.

	Change-Id: I074d5813602b2c960d59562e792b3dc6e43d2f42
	BUG: 3475
	Reviewed-on: http://review.gluster.com/322
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Anand Avati <avati@gluster.com>
*	cluster/afr: Prevent double big lock when data self-heal loops are not spawned	Pranith Kumar K	2011-09-06	2	-7/+30
	The steps in a normal data self-heal:
	1) The self-heal frame takes the big lock and gets the xattrs/stat to decide the source and sink information.
	2) Loop frames are spawned, which perform self-heal by taking small locks on the file. Each time, a new lock is taken and the old lock is released.
	3) Before releasing the final small lock, the self-heal frame takes a big lock and then unlocks the small lock. The pending xattrs are erased, then the big unlock happens, and that is the end of the data self-heal.

	The problem arises when a data self-heal is needed for a file and the fop that triggers the self-heal is an open with O_TRUNC. Fuse sends an open and then an explicit truncate for this. The open triggers the self-heal, but by the time it tries to spawn the loops the file has been truncated to size 0, so no loops are formed. These are the steps:
	1) The self-heal frame takes the big lock and gets the xattrs/stat to decide the source and sink information.
	2) Loop frames are not spawned. The big lock is not released.
	3) One more big lock is taken by the same self-heal frame, and the erasing of the pending xattrs etc. happens. It now does two big unlocks, but after the first unlock the information on which the locks were performed is forgotten, so the next unlock becomes a no-op. So there is a stale big lock on that file, preventing further writes.

	As a fix, if the loops are not spawned, use the previous big lock to perform the rest of the operations needed to complete the data self-heal. There is no need for one more big lock.

	Change-Id: Id03171269594e447b2b6d1331e362d83bd1e3430
	BUG: 3506
	Reviewed-on: http://review.gluster.com/339
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Anand Avati <avati@gluster.com>
*	cluster/afr: Bring down the self-heal window size to 1	Pranith Kumar K	2011-09-06	1	-1/+1
	This is done in an effort to be nice to system resources while self-heal is in progress.

	Change-Id: I123f1eb4d8000613a35c0117f0aa27f926f3a921
	BUG: 3503
	Reviewed-on: http://review.gluster.com/333
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
	Reviewed-by: Anand Avati <avati@gluster.com>
*	mgmt/glusterd: code re-structuring	Amar Tumballi	2011-09-05	10	-6870/+7140
	Created new files per operation (or group of operations).

	Change-Id: Iccb2a6a0cd9661bf940118344b2f7f723e23ab8b
	BUG: 3158
	Reviewed-on: http://review.gluster.com/281
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	glusterd: Removed local cli lock	Krishnan Parthasarathi	2011-09-04	8	-619/+348
	This change contains:
	- removal of the local cli lock used to serialize cli ops to a glusterd.
	- glusterd's state-machine can handle competing 'lockers' with guaranteed progress.
	- flush cluster lock on 'owner' disconnecting and, as 'owner', send unlock to all on first peer disconnect.

	Change-Id: I25961436b0790b4196f2b3438b105c37279399ad
	BUG: 3320
	Reviewed-on: http://review.gluster.com/123
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	performance/io-threads: Introduce new priority and priority-thread-limits	Pranith Kumar K	2011-08-31	2	-5/+89
	Change-Id: I7b4e7c467b833bc5896808e6e1d1b1a0322c4fdb
	BUG: 3483
	Reviewed-on: http://review.gluster.com/318
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Amar Tumballi <amar@gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	mem-pool: Make mem-pool ptr available in ptr	shishir gowda	2011-08-25	3	-3/+3
	The header of the ptr returned from mem-pool will now store the mem-pool ptr it belongs to. mem_put will now take only the pointer to be freed. Also, change the MALLOC call to GF_CALLOC in mem_get for when we run out of entries in the mem-pool. This allocation will also have the header information saved.

	Change-Id: I3de182663a7f5b49c9e9425e9531775b70bdff67
	BUG: 3390
	Reviewed-on: http://review.gluster.com/205
	Reviewed-by: Amar Tumballi <amar@gluster.com>
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	gsyncd: refine command invocation	Csaba Henk	2011-08-25	4	-32/+124
	Use the subprocess module instead of os.spawn* / ad-hoc fork/exec.

	With this, we now:
	- close unneeded files in children
	- watch children's stderr:
	  - have a thread which collects children's stderr into a ring buffer (so that the stderr pipe doesn't get stuffed)
	  - on command failure, show stderr
	- distinguish between rsync exit values, tolerating only partial errors
	- if the connection to the slave is broken, show the ssh/slave gsyncd's stderr

	Change-Id: Ia92f57b5bdfa47f8c44375c50cf279006a0bf69b
	BUG: 2946
	Reviewed-on: http://review.gluster.com/85
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Tested-by: Kaushik BV <kaushikbv@gluster.com>
	Reviewed-by: Kaushik BV <kaushikbv@gluster.com>
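	The stderr ring buffer idea above can be sketched with the standard library alone. This is not the gsyncd implementation; run_with_stderr_ring() and its max_lines parameter are hypothetical, and the sketch handles a single child rather than several.

```python
import collections
import subprocess
import sys
import threading


def run_with_stderr_ring(argv, max_lines=100):
    """Run a command and keep only the last `max_lines` lines of its stderr.

    A reader thread drains the stderr pipe continuously, so the child
    never blocks on a full pipe, and the bounded deque acts as the ring
    buffer. Returns (exit status, list of retained stderr lines).
    """
    ring = collections.deque(maxlen=max_lines)
    proc = subprocess.Popen(argv, stderr=subprocess.PIPE, close_fds=True)

    def drain():
        for raw in proc.stderr:
            ring.append(raw.decode(errors="replace").rstrip("\r\n"))

    reader = threading.Thread(target=drain, daemon=True)
    reader.start()
    returncode = proc.wait()
    reader.join()  # the pipe hits EOF once the child has exited
    proc.stderr.close()
    return returncode, list(ring)


if __name__ == "__main__":
    # On failure, show what the child wrote to stderr.
    rc, err_lines = run_with_stderr_ring(
        [sys.executable, "-c", "import sys; sys.stderr.write('oops\\n'); sys.exit(1)"]
    )
    if rc != 0:
        print("command failed with status %d:" % rc)
        for line in err_lines:
            print("  " + line)
```

	Using a deque with maxlen keeps memory bounded however much the child writes, at the cost of dropping the oldest lines.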
*	glusterd: do preparative gsyncd invocations with proper logging	Csaba Henk	2011-08-24	1	-33/+21
	Change-Id: I28de4cce140faf1b35ecdc5cbd408f21c9926341
	BUG: 3231
	Reviewed-on: http://review.gluster.com/96
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	features/locks: avoid using reqlock to prevent race	Raghavendra Bhat	2011-08-24	2	-2/+2
	Change-Id: Id8613f9641f748f996062342878070ba8fb27339
	BUG: 2473
	Reviewed-on: http://review.gluster.com/312
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Pranith Kumar Karampuri <pranithk@gluster.com>
*	glusterd / geo-rep: in status, display slave URLs in simpler normalized form	Csaba Henk	2011-08-23	3	-151/+271
	I.e., instead of writing out the fully expanded canonical URL, like
	    ssh://root@192.168.3.4:gluster://127.0.0.1:bar
	we just display
	    ssh://root@starship::bar

	Change-Id: I2bd70650cbc9973d925f652bccb163d391e406c9
	BUG: 2536
	Reviewed-on: http://review.gluster.com/79
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Kaushik BV <kaushikbv@gluster.com>
*	performance/stat-prefetch: fix memory leak	Raghavendra G	2011-08-23	1	-1/+8
	Change-Id: I84580e297ba93a9a093c2e3432ea52e3c0db4a1a
	BUG: 3467
	Reviewed-on: http://review.gluster.com/307
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	cluster/distribute: unwind the proper dict in getxattr_cbk	Amar Tumballi	2011-08-23	1	-1/+4
	Without this, quota would get confused about the total size.

	Change-Id: I0fb822ee67e3c1585f783ae35292fe71c47ee249
	BUG: 3421
	Reviewed-on: http://review.gluster.com/304
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	features/marker-quota: Fix invalid reads in readdir_cbk. v3.3.0qa7	Junaid	2011-08-22	1	-8/+24
	Change-Id: Icc1e9dc039f1f2d7ee94c689779a715a69d373fa
	BUG: 3389
	Reviewed-on: http://review.gluster.com/296
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	mgmt/glusterd: Initialize local variable in volgen	Vijay Bellur	2011-08-22	1	-1/+3
	Change-Id: I84b4f7c9c2787334ce67e5c3e0534953b691c8e0
	BUG: 3460
	Reviewed-on: http://review.gluster.com/295
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Raghavendra Bhat <raghavendrabhat@gluster.com>
*	cluster/afr: Perform flush on all the children involved in self-heal	Pranith Kumar K	2011-08-22	1	-19/+6
	Change-Id: I66362a3087a635fb7b759d7836a1f6564a6a7fc9
	BUG: 3456
	Reviewed-on: http://review.gluster.com/294
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	cluster/afr: Change definition of stale child	Pranith Kumar K	2011-08-22	1	-1/+1
	The code is checking for priv->child_up[i], which can change while the fop is in progress. Since pending[child][id-of-transaction] alone is enough to tell if the child became stale or not, use just that.

	Change-Id: I494bf02cca66f4fd41526195fafce86a202c6bd1
	BUG: 3455
	Reviewed-on: http://review.gluster.com/293
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	cluster/afr: Paused fop should not continue with fop	Pranith Kumar K	2011-08-22	3	-3/+11
	Change-Id: Idce22a6266c354e327d5d717715d2e62533eec58
	BUG: 3448
	Reviewed-on: http://review.gluster.com/292
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	features/marker: changes in marker to avoid race conditions and corruptions v3.3.0qa6	Raghavendra Bhat	2011-08-21	4	-61/+217
	Change-Id: I38ddfab200d59dd4c8e9f9dd964a98f3d7aa7ab7
	BUG: 3389
	Reviewed-on: http://review.gluster.com/289
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	protocol/client: Changes to be benign to replace-brick	Vijay Bellur	2011-08-21	1	-5/+5
	Change-Id: Ic227781760a5f6dbf8aad69a19f90e45d4aaec13
	BUG: 3415
	Reviewed-on: http://review.gluster.com/288
	Reviewed-by: Krishnan Parthasarathi <kp@gluster.com>
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	debug/io-stats: Handle loglevel in init	Vijay Bellur	2011-08-21	1	-0/+8
	Change-Id: I5aa6ee7509a8f730ca64e2f7bada56d502785a6c
	BUG: 3415
	Reviewed-on: http://review.gluster.com/287
	Reviewed-by: Raghavendra Bhat <raghavendrabhat@gluster.com>
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	glusterd: replace-brick status was not 'shared' with peer. v3.3.0qa5	Krishnan Parthasarathi	2011-08-21	3	-76/+297
	Change-Id: Ia2d89fd919b077232a37debc2aebe1bc72150856
	BUG: 3432
	Reviewed-on: http://review.gluster.com/285
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>
*	cluster/afr: fop should not continue if it is paused, until resumes	Pranith Kumar K	2011-08-21	2	-0/+8
	Change-Id: Ie026ebed98cf5ff75ae1a13437d29f67d0e0254a
	BUG: 3448
	Reviewed-on: http://review.gluster.com/286
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
	Reviewed-by: Raghavendra Bhat <raghavendrabhat@gluster.com>
	Reviewed-by: Vijay Bellur <vijay@gluster.com>