glusterfs.git -

	Commit message	Author	Age	Files	Lines
*	cluster/dht: Changed log level to DEBUG	Nithya Balachandran	2015-03-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Changing log level from GF_LOG_INFO to GF_LOG_DEBUG to prevent logs getting flooded. This is the same fix as addressed by: http://review.gluster.org/#/c/8621 Change-Id: I6fa04a848fe1aa5829c7c74d2ef9a5636d2dbda4 BUG: 1206120 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/10008 Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	cluster/dht: set op_errno correctly during migration.	Shyam	2015-03-05	1	-3/+22
\| \| \| \| \| \| \| \| \| \| \| \|	Backport of, http://review.gluster.org/#/c/6219/6 Change-Id: Ic3419dd05b4dbe49b6adf5648bdbe137722a6d04 BUG: 1166278 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/9599 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
*	cluster/dht: Fixed double UNWIND in lookup everywhere code	Kaleb S. KEITHLEY	2015-03-03	1	-4/+4
	Backport of http://review.gluster.org/#/c/8666/2 on master. In dht_lookup_everywhere_done, line 1107 calls DHT_STACK_UNWIND, and the same if block then executes goto unwind_hashed_and_cached, which at line 1371 calls UNWIND a second time. Higher frames may already have cleaned up their locals after the first unwind, so receiving a second unwind could crash the process with a core dump. Fixed by returning immediately after the first unwind. Change-Id: I4629680af7ebecd9828941d883e33fb6e43d9703 BUG: 1151397 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/9765 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
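The double-unwind pattern described in the commit above can be illustrated with a minimal sketch. `struct frame`, `stack_unwind`, and both `lookup_done_*` functions are hypothetical stand-ins invented for this illustration; they are not the real `call_frame_t`, `DHT_STACK_UNWIND`, or `dht_lookup_everywhere_done`.

```c
#include <stdio.h>

/* Hypothetical stand-in for a call frame; counts how often it was unwound. */
struct frame {
    int unwound;
};

/* Hypothetical stand-in for an unwind macro: it must run once per frame. */
static void stack_unwind(struct frame *f, int op_ret)
{
    f->unwound++;
    printf("unwind #%d, op_ret=%d\n", f->unwound, op_ret);
}

/* Buggy shape: unwinds, then jumps to a label that unwinds again. */
static void lookup_done_buggy(struct frame *f, int found_cached)
{
    if (found_cached) {
        stack_unwind(f, 0);
        goto unwind_hashed_and_cached; /* leads to a second unwind */
    }
    return;

unwind_hashed_and_cached:
    stack_unwind(f, 0);
}

/* Fixed shape: return right after the first unwind. */
static void lookup_done_fixed(struct frame *f, int found_cached)
{
    if (found_cached) {
        stack_unwind(f, 0);
        return;
    }
    stack_unwind(f, -1);
}
```

In the buggy variant the frame is unwound twice on the same path; in the fixed variant every path unwinds exactly once.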
*	Cluster/DHT: Changing rename log severity	Nithya Balachandran	2014-10-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Changing log level for a rename message from debug to info to improve debuggability Change-Id: I53031fcf97fffd62095692477330ecde0cf47dcd BUG: 1139998 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/8582 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-on: http://review.gluster.org/8685 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	cluster/dht: Rename should not fail post hardlink creation	Shyam	2014-10-20	2	-45/+95
	In the rename path, we wind the creation of the newname hardlink and of the linkto file in the dst hashed subvolume at the same time. If the linkto creation fails but the link creation succeeds, we enter the failure code and clean up the created newname hardlink. In the interim, if another client looks up newname and FUSE finds it to be a hardlink, it could send an unlink for oldname instead of a rename. Combined with the cleanup code above, this could remove every copy of the file and thereby lose data. This fix separates these steps into 2 parts, creating the linkto first and then the link file, so that no failure after link file creation cleans up the newname file. If linkto creation fails, the link is not attempted, so the namespace is not polluted with newname. Change-Id: I61da8e906060da16a31ea1076eec2f01fd617f44 BUG: 1139998 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/8570 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8683 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	cluster/dht: Treat linkto file rename failure as non-critical error	Shyam	2014-10-20	1	-6/+36
	It is a critical failure if and only if we fail to rename the cached file. If the rename of the linkto fails, it is not a critical failure, and we do not want to lose the hard link created for the new name, as other clients could already have read it. NOTE: If another client is attempting the same oldname -> newname rename and finds that both file names exist and are hard links to each other, FUSE would send an unlink for oldname. If, during this window, we treat the linkto failure as a critical error and unlink the newname we created, we would effectively have lost the file to the rename operations. The repercussion of treating this as a non-critical error is that we could leave behind a stale linkto file and/or fail to create the new linkto file. The second case would be rectified by a subsequent lookup, the first by a rebalance, as for all stale linkto files. Change-Id: Ia53ad8b43c3cf8f48ef5b43fd1fec4274e807556 BUG: 1139998 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/8563 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8682 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	cluster/dht: synchronize rename and file-migration	Raghavendra G	2014-10-20	3	-43/+292
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: I4f243c946f76d440680b651235f925e3d0ebf0fd Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/8523 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> BUG: 1139998 Reviewed-on: http://review.gluster.org/8681 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	cluster/dht: introduce locking api.	Raghavendra G	2014-10-20	5	-1/+662
\| \| \| \| \| \| \| \| \| \| \| \|	Change-Id: I41389ba91951d3e63e617aa32cd0bee848261c72 BUG: 1139998 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/8521 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8679 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	cluster/dht: Fix dht_access treating directory like files	Shyam	2014-10-20	1	-3/+5
	When the cluster topology changes due to add-brick, not all subvolumes of DHT will contain the directories until a rebalance is completed. Until the rebalance is run, if a caller bypasses lookup and calls access using saved/cached inode information (as the NFS server does), dht_access misreads the error (ESTALE/ENOENT) from the new subvolumes and incorrectly tries to handle the inode as a file. This corrupts the directory's in-memory state in DHT, which does not heal even after a rebalance. This commit fixes the problem in dht_access, thereby preventing DHT from misrepresenting a directory as a file in the case presented above. Change-Id: Idcdaa3837db71c8fe0a40ec0084a6c3dbe27e772 BUG: 1139997 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/8462 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8678 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	cluster/dht: Prevent dht_access from going into a loop.	shishir gowda	2014-10-20	3	-1/+40
	If access fails with ENOTCONN, do not wind to the same subvol. We wind to first-up-subvol if access fails with ENOTCONN. In a few cases, if dht has only 1 subvolume and access fails with ENOTCONN, we go into an infinite loop of winding to the same subvol. The fix is to check whether we previously wound to the same subvol, and to fail if first-up-subvol is the same. Change-Id: Ib5d3ce7d33e8ea09147905a7df1ed280874fa549 BUG: 1139996 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/5319 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/8677 Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
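The retry rule from the commit above (retry on the first up subvolume after ENOTCONN, but fail if that is the subvolume already tried) might look like the following sketch. `struct subvol` and `next_access_target` are hypothetical names for illustration and do not reflect the real DHT data structures.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a DHT subvolume. */
struct subvol {
    const char *name;
    int up; /* non-zero if DHT currently believes the subvolume is up */
};

/*
 * Returns the subvolume to retry access on, or NULL when the operation
 * should fail instead of being wound again.
 */
static const struct subvol *
next_access_target(const struct subvol *subvols, size_t count,
                   const struct subvol *prev, int op_errno)
{
    const struct subvol *first_up = NULL;
    size_t i;

    if (op_errno != ENOTCONN)
        return NULL; /* only ENOTCONN triggers a retry */

    for (i = 0; i < count; i++) {
        if (subvols[i].up) {
            first_up = &subvols[i];
            break;
        }
    }

    /* Retrying on the subvolume we just wound to would loop forever. */
    if (first_up == NULL || first_up == prev)
        return NULL;

    return first_up;
}
```

With a single subvolume, `first_up` is always the previous target, so the function returns NULL and the loop described in the commit cannot occur.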
*	dht: fix rename race	Nithya Balachandran	2014-10-20	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Additional check to check if we created the linkto file before deleting it in the rename cleanup function Change-Id: I919cd7cb24f948ba4917eb9cf50d5169bb730a67 BUG: 1139988 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/8338 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8676 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	cluster/dht: Fix races to avoid deletion of linkto file	Venkatesh Somyajulu	2014-10-20	3	-59/+617
	Explanation of the race between rebalance processes: https://bugzilla.redhat.com/show_bug.cgi?id=1110694#c4
	scenario-1:
	===========
	STATE 1: BRICK-1 is the only brick; the cached file is in the system.
	STATE 2: Add brick-2 (BRICK-1, BRICK-2).
	STATE 3: Lookup of the file on brick-2 by this node's rebalance will fail because the hashed file is not created yet. So dht_lookup_everywhere is about to get called.
	STATE 4: As part of lookup, the link file at brick-2 will be created.
	STATE 5: getxattr to check that the cached file belongs to this node is done.
	STATE 6: dht_lookup_everywhere_cbk detects the link created by rebalance-1. It will unlink it.
	STATE 7: getxattr on the link file with the "pathinfo" key will be called and will fail, as the link file was deleted by the rebalance on node-2.
	Fix: In STATE 6, we should avoid the deletion of the link file. Every time dht_lookup_everywhere gets called, lookup is performed on all the nodes. So to avoid STATE 6, if a linkto file is found, it is not deleted until a valid case is found in dht_lookup_everywhere_done.
	Case 1: If the linkto file points to the cached node and the cached file exists, unwind with success.
	Case 2: If the linkto does not point to the current cached node and the cached file exists: a) unlink the stale link file, b) create a new link file.
	Case 3: Only the linkto file exists: delete the linkto file.
	Case 4: Only the cached file exists: create the link file (handled even without this patch).
	Case 5: Neither the cached nor the hashed file is present: return with ENOENT (handled even without this patch).
	Reviewed-on: http://review.gluster.org/8231
	****************************************************************************
	scenario-2:
	===========
	cluster/dht: Modified logic of linkto file deletion on non-hashed
	Currently, whenever dht_lookup_everywhere gets called, if dht_lookup_everywhere_cbk finds a linkto file on a non-hashed subvolume, the file is unlinked. But there are cases when this file is under migration. In that condition we should avoid deleting the file. When some other rebalance process changes the layout of the parent such that dst_file (w.r.t. migration) falls on a non-hashed node, lookup may have found it as a linkto file, but just before the unlink the file is under migration or already migrated. In such cases the unlink can be avoided.
	Race:
	-------
	Suppose we have two bricks (brick-1 and brick-2) with an initial file "a" under BaseDir which is hashed as well as cached on brick-1. Assume hashing "a" gives 44.
	                 Brick-1      Brick-2
	Initial Setup:   BaseDir/a    BaseDir
	                 [1-50]       [51-100]
	Now add a new brick, Brick-3.
	1. Rebalance-1 on Node-1 (the Brick-1 node) will reset the BaseDir layout.
	2. After that it will a) create a linkto file on the new hashed subvolume (brick-2) and b) perform file migration.
	1. Rebalance-1 fixes the base layout:
	   Brick-1      Brick-2              Brick-3
	   ---------    ----------           ------------
	   BaseDir/a    BaseDir              BaseDir
	   [1-33]       [34-66]              [67-100]
	2. Only a) is performed (create linkto file):
	   BaseDir/a    BaseDir/a(linkto)    BaseDir
	Now rebalance-2 on node-2 jumps in and performs steps 1 and 2-a. After (rebal-2, step-1), it changes the layout of BaseDir:
	   BaseDir/a    BaseDir/a(link)      BaseDir
	   [67-100]     [1-33]               [34-66]
	For (rebal-2, step-2), it will perform lookup at Brick-3, since under the new layout 44 falls on brick-3. But the lookup will fail, so dht_lookup_everywhere gets called.
	NOTE: On brick-2, a linkto file was created by rebalance-1. Currently that linkto file gets deleted by the rebalance-2 lookup, as it is considered a stale linkto file. With this patch, if rebalance is already in progress or rebalance is over, the linkto file will not be unlinked. If rebalance is in progress, the fd will be open, and if rebalance is over, the linkto file won't be set.
	Reviewed-on: http://review.gluster.org/8345
	*****************************************************************************
	scenario-3:
	===========
	cluster/dht: Added keys in dht_lookup_everywhere_done
	In the case where both the cached (C1) and hashed files are found, but the hashed file does not point to the above cached node (C1), don't unlink if either fd-is-open on hashed or linkto-xattr is not found.
	Reviewed-on: http://review.gluster.org/8429 BUG: 1139995 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Change-Id: I86d0a21d4c0501c45d837101ced4f96d6fedc5b9 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: susant palai <spalai@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8674 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
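The five cases listed under scenario-1 of the commit above amount to a small decision table. The sketch below encodes only that table; the enum, the function name, and the boolean inputs are hypothetical, and the extra scenario-3 conditions (open fd on the hashed file, missing linkto xattr) are deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical action names, one per case in the commit message. */
enum lookup_action {
    ACT_UNWIND_SUCCESS,       /* Case 1 */
    ACT_REPLACE_STALE_LINKTO, /* Case 2: unlink stale linkto, create new one */
    ACT_DELETE_LINKTO,        /* Case 3 */
    ACT_CREATE_LINKTO,        /* Case 4 */
    ACT_RETURN_ENOENT         /* Case 5 */
};

/* Maps what lookup-everywhere found to the action for that case. */
static enum lookup_action
decide_lookup_action(bool cached_exists, bool linkto_exists,
                     bool linkto_points_to_cached)
{
    if (cached_exists && linkto_exists)
        return linkto_points_to_cached ? ACT_UNWIND_SUCCESS
                                       : ACT_REPLACE_STALE_LINKTO;
    if (linkto_exists)
        return ACT_DELETE_LINKTO;
    if (cached_exists)
        return ACT_CREATE_LINKTO;
    return ACT_RETURN_ENOENT;
}
```

Deferring the decision until all subvolumes have answered is the point of the fix: no linkto file is removed before one of these cases has been identified.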
*	DHT/Create : Failing to identify a linkto file in lookup_everywhere_cbk path	Susant Palai	2014-10-20	1	-5/+43
	In case a file is not found in its cached subvol we proceed with dht_lookup_everywhere. But as we don't add the linkto xattr to the dictionary, we fail to identify any linkto file encountered. The implication is that we end up treating the linkto file as a regular file and proceed with the fop. Change-Id: Iab02dc60e84bb1aeab49182f680c0631c33947e2 BUG: 1139992 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/8277 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-on: http://review.gluster.org/8673 Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	dht: fix rename race	Jeff Darcy	2014-10-20	2	-4/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If two clients try to rename the same file at the same time, we sometimes end up with no file at all in either the old or new location. That's kind of bad. The culprit seems to be some overly aggressive cleanup code. AFAICT, based on today's study of the code, the intent of the changed section is to remove any linkfile we might have created before the actual rename. However, what we're removing might not be our extra link. If we're racing with another client that's also doing a rename, it might be the only remaining link to the user's data. The solution, which is good enough to pass this test but almost certainly still not complete, is to be more selective about when we do this unlink. Now, we only do it if we know that, at some point, we did in fact create the link without error (notably ENOENT on the source or EEXIST on the destination) ourselves. Change-Id: I8d8cce150b6f8b372c9fb813c90be58d69f8eb7b BUG: 1139988 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/8269 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8672 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
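The rule in the commit above (only unlink on cleanup if this client created the link without error) can be sketched as follows. `struct rename_state` and both helper functions are hypothetical and exist only for this illustration.

```c
/* Hypothetical per-operation state for a rename. */
struct rename_state {
    int link_created; /* set only when our own link call succeeded */
};

/* Record the outcome of our link attempt; op_ret follows the 0 / -1 convention. */
static void record_link_result(struct rename_state *st, int op_ret)
{
    st->link_created = (op_ret == 0);
}

/* Cleanup may remove the link only if we know we created it ourselves. */
static int should_unlink_on_cleanup(const struct rename_state *st)
{
    return st->link_created;
}
```

If the link call failed (for example with EEXIST on the destination), the link belongs to another client and cleanup leaves it alone.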
*	DHT/readdirp: Directory not shown/healed on mount point if exists on single brick(non first up subvolume).	Susant Palai	2014-10-20	3	-5/+62
	Problem: If a snapshot is taken when mkdir has succeeded only on hashed_subvolume, then after restoring the snapshot the directory is not shown on the mount point. Why: dht_readdirp takes into account only those directory entries which are present on first_up_subvolume. Hence, if the "hashed subvolume" is not the same as first_up_subvolume, the directory won't be listed on the mount point and also won't be healed. Solution: Case 1: (Rebalance not running) If hashed subvolume is NULL or down then filter in first_up_subvolume. Otherwise the corresponding hashed subvolume will take care of the directory entry. Case 2: If readdirp_optimize option is turned on then read from first_up_subvol Change-Id: Idaad28f1c9f688dbfb1a8a3ab8b244510c02365e BUG: 1139986 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/7599 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8671 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	dht/rebalance: Do not allow rebalance when gfid mismatch found	Venkatesh Somyajulu	2014-10-20	1	-1/+17
	Due to a race condition, the gfid obtained in readdirp and the gfid found by lookup may differ for a given name. In that case, do not allow the rebalance. Readdirp of an entry brings the gfid, which is stored in the inode through inode_link. When lookup is done and the gfid it brings differs from the one stored in the inode, client3_3_lookup_cbk returns ESTALE and the error is captured by the rebalance process. Change-Id: Iad839177ef9b80c1dd0e87f3406bcf4cb018e6fa BUG: 1139984 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/7973 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8670 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	cluster/dht: Fix dict_t leaks in rebalance process' execution path	Krutika Dhananjay	2014-09-23	1	-4/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Backport of: http://review.gluster.org/8763 Two dict_t objects are leaked for every file migrated in success codepath. It is the caller's responsibility to unref dict that it gets from calls to syncop_getxattr(); and rebalance performs two syncop_getxattr()s per file without freeing them. Also, syncop_getxattr() on GF_XATTR_LINKINFO_KEY doesn't seem to be using the response dict. Hence, NULL is now passed as opposed to @dict to syncop_getxattr(). Change-Id: I59556ee6e135e7e65d4ddd31ba0f39e7eb50b02d BUG: 1144792 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/8789 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
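The ownership rule stated in the commit above (the caller must unref the dict it receives) can be shown with a toy reference count. Everything here is a hypothetical stand-in: `struct dict`, `fetch_xattrs`, and `dict_unref_sketch` are not the real `dict_t`, `syncop_getxattr()`, or `dict_unref()`.

```c
#include <stdlib.h>

/* Counts dicts that are still allocated, so leaks are observable. */
static int live_dicts = 0;

/* Hypothetical reference-counted dictionary. */
struct dict {
    int refcount;
};

/* Hypothetical getxattr-style call: hands the caller one reference. */
static int fetch_xattrs(struct dict **out)
{
    struct dict *d = malloc(sizeof(*d));

    if (d == NULL)
        return -1;
    d->refcount = 1;
    live_dicts++;
    *out = d;
    return 0;
}

/* Drops one reference and frees the dict when none remain. */
static void dict_unref_sketch(struct dict *d)
{
    if (d != NULL && --d->refcount == 0) {
        live_dicts--;
        free(d);
    }
}

/* Per-file work: the caller releases the reference it was given. */
static int migrate_one_file(void)
{
    struct dict *xattrs = NULL;

    if (fetch_xattrs(&xattrs) != 0)
        return -1;
    /* ... inspect xattrs here ... */
    dict_unref_sketch(xattrs);
    return 0;
}
```

Omitting the `dict_unref_sketch` call would leave `live_dicts` growing by one per file, which is the shape of the leak the commit describes.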
*	cluster/dht: Don't do extra unref in dht-migration checks	Vijay Bellur	2014-06-18	1	-5/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: syncop_open used to perform a ref in syncop_open_cbk so the extra unref was needed but now syncop_open_cbk does not take a ref so no need to do extra unref. Fix: remove the extra fd_unref and let dht_local_wipe do the final unref. Change-Id: Ibe8f9a678d456a0c7bff175306068b5cd297ecc4 BUG: 961615 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8029 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: Joe Julian <joe@julianfamily.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	core: fix Ubuntu code audit (cppcheck) results	Kaleb S. KEITHLEY	2014-05-25	2	-5/+1
	These block inclusion in Ubuntu Main repo.
	AFAICT these are false positives:
	[rpc/rpc-transport/rdma/src/rdma.c:3074]: (error) Address of local auto-variable assigned to a function parameter.
	[xlators/features/marker/utils/src/gsyncd.c:99]: (error) Memory leak: str
	[xlators/features/marker/utils/src/gsyncd.c:354]: (error) Memory leak: argv
	[xlators/nfs/server/src/nlm4.c:1176]: (error) Possible null pointer dereference: fde
	The remainder are fixed with this change-set:
	[api/src/glfs-fops.c:700]: (error) Possible null pointer dereference: gio
	[api/src/glfs-fops.c:702]: (error) Possible null pointer dereference: frame
	[xlators/cluster/afr/src/afr-inode-write.c:375]: (error) Possible null pointer dereference: frame
	[xlators/cluster/afr/src/afr-self-heal-common.c:1522]: (error) Possible null pointer dereference: local
	[xlators/cluster/dht/src/dht-rebalance.c:1574]: (error) Possible null pointer dereference: ctx
	[xlators/cluster/stripe/src/stripe.c:4407]: (error) Possible null pointer dereference: local
	[xlators/mgmt/glusterd/src/glusterd-mountbroker.c:675]: (error) Possible null pointer dereference: cookieswitch
	[xlators/mgmt/glusterd/src/glusterd-mountbroker.c:677]: (error) Possible null pointer dereference: cookieswitch
	[xlators/mgmt/glusterd/src/glusterd-replace-brick.c:924]: (error) Resource leak: file
	[xlators/mgmt/glusterd/src/glusterd-replace-brick.c:1008]: (error) Resource leak: file
	[xlators/mgmt/glusterd/src/glusterd-sm.c:248]: (error) Possible null pointer dereference: new_ev_ctx
	[xlators/mgmt/glusterd/src/glusterd-store.c:1250]: (error) Possible null pointer dereference: handle
	[xlators/mgmt/glusterd/src/glusterd-utils.c:4272]: (error) Possible null pointer dereference: this
	[xlators/mgmt/glusterd/src/glusterd-utils.c:5113]: (error) Possible null pointer dereference: this
	[xlators/mount/fuse/src/fuse-bridge.c:4432]: (error) Uninitialized variable: finh
	[xlators/mount/fuse/src/fuse-bridge.c:2927]: (error) Possible null pointer dereference: state
	[xlators/mount/fuse/src/fuse-bridge.c:3226]: (error) Possible null pointer dereference: state
	[xlators/storage/bd_map/src/bd_map.c:1504]: (error) Possible null pointer dereference: bd_fd
	[xlators/storage/bd_map/src/bd_map.c:1728]: (error) Possible null pointer dereference: n_entry
	[xlators/storage/bd_map/src/bd_map.c:1741]: (error) Possible null pointer dereference: n_entry
	[xlators/performance/quick-read/src/quick-read.c:585]: (error) Possible null pointer dereference: iobuf
	rerunning cppcheck --force afterwards:
	Test code, don't care:
	[extras/test/test-ffop.c:27]: (error) Buffer overrun possible for long command line arguments.
	False positive after fix:
	[xlators/cluster/stripe/src/stripe.c:4407]: (error) Possible null pointer dereference: local
	Still false positive:
	[xlators/features/marker/utils/src/gsyncd.c:354]: (error) Memory leak: argv
	[xlators/nfs/server/src/nlm4.c:1176]: (error) Possible null pointer dereference: fde
	Not built, don't care:
	[xlators/cluster/ha/src/ha.c:2699]: (error) Possible null pointer dereference: priv
	Change-Id: I1fb849e9c042d3a3701cb05121d413e58e73d505 BUG: 1086460 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/7583 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	dht: dht_lookup_dir_cbk should set op_errno as local->op_errno (tag: v3.4.3beta1)	shishir gowda	2014-03-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Two glusterfs clients return inconsistent errnos when the bricks of the volume were down. Consider two gluster mounts. Mount 1 was done when the bricks were online. Mount 2 was done after the bricks were killed, (using the 'glusterfs' command instead of the mount script). For any request, mount 1 will return ENOTCONN, where as mount 2 will return ENOENT. This happens because for the 2nd mount, a fuse would send a lookup on '/' for any request, as it hadn't been done yet. The client xlator returns ENOTCONN, but the dht_lookup_dir_cbk changed this to ENOENT unconditionally when aggregating. So, fuse returned ENOENT, even though the errno should have been ENOTCONN. backporting http://review.gluster.org/6072 BUG: 1019095 Change-Id: Iaa40dffefddfcaf1ab7736f5423d7f9d2ece1363 Original-author: Kaushal M <kaushal@redhat.com> Signed-off-by: shishir gowda <gowda.shishir@gmail.com> Reviewed-on: http://review.gluster.org/6471 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-by: Anand Avati <avati@redhat.com>
*	cluster/dht: Make sure loc has gfid	Pranith Kumar K	2014-01-13	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In some code paths neither loc->gfid nor loc->inode->gfid is populated which leads to EINVAL for linkfile setattr in dht_linkfile_attr_heal. Fix: Populate loc->gfid before dht_linkfile_attr_heal. BUG: 971805 Change-Id: I8e4b7510ee5c38aa9ccf5283c7165c7df25ec62b Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/6691 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/dht: Ignore ENOENT errors for unlink of linkfiles	Anand Avati	2013-12-24	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Backport of http://review.gluster.org/4971 If unlink of linkfile returns ENOENT, do not fail unlink. Proceed with unlinking of cached file. Change-Id: If7cec92b40c39d68dd9c3606c6c2c3a6bd67d27b BUG: 966848 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6586 Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	Revert "core: fix errno for non-existent GFID"	Vijay Bellur	2013-12-24	5	-12/+8
\| \| \| \| \| \| \| \| \|	This reverts commit 837422858c2e4ab447879a4141361fd382645406 Change-Id: I0909f26ce088454bb14b3694b489c672286a4ae6 Reviewed-on: http://review.gluster.org/6575 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/dht: interim fix for reverting 837422858c	Vijay Bellur	2013-12-24	1	-1/+1
\| \| \| \| \| \| \| \|	Change-Id: I74818a03f7c5d7891561515af2fa35ea3775255c BUG: 1032894 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/6582 Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	cluster/dht: Ignore decommissioned subvol in overlap optimization	shishir gowda	2013-12-16	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	Change-Id: Ib727948c6e21b19fd509f258ff0aea1c5d1a84d1 BUG: 966845 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/5056 Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-on: http://review.gluster.org/6517 Reviewed-by: Shishir Gowda <gowda.shishir@gmail.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/dht: Fix anomaly check (tag: v3.4.2qa3)	shishir gowda	2013-12-14	1	-3/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were wrongly detecting holes/overlaps for already accounted errors. Additionally, sort should also handle zero'ed out layout Change-Id: Ic3d13e1d735b914f9acc01fe919bc90656baea48 BUG: 1003851 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/5762 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6469 Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/dht: Del GF_READDIR_SKIP_DIRS key from dict for first_up	shishir gowda	2013-12-14	2	-4/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, we sent GF_READDIR_SKIP_DIRS for all subvolumes if first_subvol != first_up_subvolume. Also first_up_subvolume can change with-in the life of a call and cbk. Saving the first_up_subvol in dht_local for checks. Back porting fix http://review.gluster.org/5577 BUG: 996474 Change-Id: I67b5bbe781e12812557b569b7d0a0beba4224159 Signed-off-by: shishir gowda <gowda.shishir@gmail.com> Reviewed-on: http://review.gluster.org/6468 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/dht: Do migration inprog/complete check only if ENOENT	Anand Avati	2013-12-13	2	-1/+7
	Additionally, update op_errno to the latest failure. If failures are found in complete_check, the error returned would be EUCLEAN instead of the right failure (in this case ENOENT). Change-Id: Ib813867f4b817af651627b9ea07b0b09fa2b26ce BUG: 966852 Original-author: shishir gowda <sgowda@redhat.com> Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6495 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/dht: set layout in inode ctx even if linkfile fails	shishir gowda	2013-12-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Creating linkfile could have failed, but we dont care about linkfile for setting layout in the inode ctx (could be EEXIST etc.) So ignore @inode in cbk and pick it up from local->loc.inode Backporting http://review.gluster.org/6319 BUG: 1032859 Change-Id: Ic95e303a4c060900d041820d4faa68d1c4685b6a Original-author: Anand Avati <avati@redhat.com> Signed-off-by: shishir gowda <gowda.shishir@gmail.com> Reviewed-on: http://review.gluster.org/6470 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	distribute: Rebalance should provide even disk space distribution	shishir gowda	2013-12-10	1	-15/+30
	The earlier disk space check had an issue: it did not avoid migration when the destination had less available space. The scenario we need to avoid is stated below: During rebalance `migrate-data`, the destination subvol experiences a `reduction` in 'blocks' of free space, while at the same time the source subvol gains certain 'blocks' of free space. A valid check is necessary here to avoid an erroneous move to a destination where space could be scantily available. This patch provides a proper fix by subtracting the necessary file blocks from the destination and adding those blocks to the source. backporting fix http://review.gluster.org/5961 BUG: 982919 Original-author: Harshavardhana <harsha@harshavardhana.net> Signed-off-by: shishir gowda <gowda.shishir@gmail.com> Change-Id: If5808eaa89e66d7bcaeee7268fe3fe5b1b56f51d Signed-off-by: shishir gowda <gowda.shishir@gmail.com> Reviewed-on: http://review.gluster.org/6461 Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
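The adjustment described in the commit above (subtract the file's blocks from the destination, add them to the source, then compare) can be sketched as below. The function name, the parameters, and the exact comparison are assumptions made for illustration; the real check in the rebalance code may differ.

```c
#include <stdint.h>

/*
 * Returns 1 if moving a file of file_blocks blocks would still leave the
 * destination with at least as much free space as the source, else 0.
 * All values are counts of free blocks in the same block size.
 */
static int migration_keeps_balance(uint64_t src_free, uint64_t dst_free,
                                   uint64_t file_blocks)
{
    uint64_t dst_after;
    uint64_t src_after;

    if (dst_free < file_blocks)
        return 0; /* the file does not fit on the destination at all */

    dst_after = dst_free - file_blocks; /* destination loses these blocks */
    src_after = src_free + file_blocks; /* source regains them */

    return dst_after >= src_after;
}
```

Comparing the projected values rather than the current ones is what prevents a move that looks fine beforehand but leaves the destination with less free space than the source afterwards.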
*	cluster/dht: handle NULL check before strlen/strcmp in fgetxattr	Anand Avati	2013-12-03	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	@key can legally be NULL. Handle that case without crashing. Change-Id: Iaae293caa7eeb24afc9cd2580799173e2ce00911 BUG: 1036879 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6402 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/dht: set layout in inode ctx even if linkfile fails	Anand Avati	2013-11-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Creating linkfile could have failed, but we don't care about linkfile for setting layout in the inode ctx (could be EEXIST etc.) So ignore @inode in cbk and pick it up from local->loc.inode Change-Id: I2952799d7ae0d3441b84b2ca2981afd75d7576e2 BUG: 1032859 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6358 Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	core: fix errno for non-existent GFID	Anand Avati	2013-11-26	5	-8/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When clients refer to a GFID which does not exist, the errno to be returned is ESTALE (and not ENOENT). Even though ENOENT might look "proper" most of the time, as the application eventually expects ENOENT even if a parent directory does not exist, not returning ESTALE prevents resolvers (FUSE and GFAPI) from retrying resolution in uncached mode. This can result in spurious ENOENTs during concurrent path modification operations. Change-Id: I7a06ea6d6a191739f2e9c6e333a1969615e05936 BUG: 1032894 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6322 Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	cluster/dht: Treat migration failures due to space constraints as skipped [v3.4.1qa3]	shishir gowda	2013-09-19	2	-5/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently rebalance/remove-brick ops display the migration failed count even for files which failed due to space issues (not enough space for the file, or migration leading to cluster imbalance). These will now be counted as skipped, and rebalance/remove-brick status will display the additional counter. BUG: 989846 Change-Id: I4efa7ce69dd43680ff47181afed0c561954c5080 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/5977 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	cluster/dht: Ignore subvols with error in min-free-disk/inodes	Amar Tumballi	2013-09-10	5	-17/+86
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently when selecting an alternative subvolume when the hashed subvol has exceeded min-free-disk/inodes, we do not check if layouts have errors (including decommissioning). This leads to data being written to those subvolumes, and in case of decommissioning, will lead to data loss. BUG: 982919 > Original-Author: shishir gowda <sgowda@redhat.com> > Reviewed-on: http://review.gluster.org/5299 Change-Id: If301a86cf3ca5fad6529bd2e61382f9901663ba0 Signed-off-by: Amar Tumballi <amarts@redhat.com> Reviewed-on: http://review.gluster.org/5888 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/dht: assign layout onto missing directories too	Anand Avati	2013-09-10	1	-4/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current self-healing algorithm is ignoring missing directories for assigning new layout. When lookup() is racing against mkdir() or when self-healing a half-done mkdir(), the layout assignment split must happen based on the final number of directories, and not the currently existing number of directories (because we finish mkdir() of missing directories before hash layout assignment). Without this fix, concurrent mkdir() and lookup() will step on each other's feet, create a messed up layout on disk, and end up with different in-memory layouts. Once two clients have different in-memory layouts, creation of a subdirectory will not arbitrate on the same hashed subvolume and will result in GFID mismatch of the sub-directory. Change-Id: Ia47acad67c265060405984c822b4d37512b9dbb3 BUG: 907072 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/5871 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	Fix crash in dht_migration_complete_check_task because of NULL fd	Emmanuel Dreyfus	2013-06-08	2	-1/+3
\| \| \| \| \| \| \| \| \| \| \|	This is a backport of Ia5a5d40bcea7bfb320ef7096af1e035b8847d4ff BUG: 960055 Change-Id: Ibf3547a775d7ca2f3a097c880cdf38ffafb322da Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/5139 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	dht,posix: support for case discovery	Anand Avati	2013-06-08	1	-0/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is support for discovering a filename in a given directory which has a case insensitive match of a given name. It is implemented as a virtual extended attribute on the directory where the required filename is specified in the key. E.g: sh# getfattr -e "text" -n user.glusterfs.get_real_filename:FiLe-B /mnt/samba/patchy getfattr: Removing leading '/' from absolute path names # file: mnt/samba/patchy user.glusterfs.get_real_filename:FiLe-B="file-b" In reality, there can be multiple "answers" as the backend filesystem is case sensitive and there can be multiple files which can strcasecmp() successfully. In this case we pick the first matched file from the first responding server. If a matching file does not exist, we return ENOENT (and NOT ENODATA). This way the caller can differentiate between "unsupported" glusterfs API and file not existing. This API is used by Samba VFS to perform efficient discovery of the real filename without doing a full scan at the Samba level. Change-Id: I53054c4067cba69e585fd0bbce004495bc6e39e8 BUG: 953694 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/5163 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
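The matching rule described in this commit message (first case-insensitive match wins, ENOENT when nothing matches) can be illustrated with a small C sketch. It is not the dht/posix implementation: the function `get_real_filename` below, its signature, and the idea of passing the directory entries as an in-memory array are assumptions made to keep the example self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <strings.h>

/*
 * Hypothetical helper: find the first entry that matches `wanted`
 * ignoring case. On success returns 0 and stores the real name in
 * *real_name. Returns ENOENT (not ENODATA) when there is no match,
 * so a caller can tell "no such file" from "attribute unsupported".
 */
int get_real_filename(const char *const *entries, size_t count,
                      const char *wanted, const char **real_name)
{
    assert(entries != NULL && wanted != NULL && real_name != NULL);

    for (size_t i = 0; i < count; i++) {
        if (strcasecmp(entries[i], wanted) == 0) {
            *real_name = entries[i]; /* first match wins */
            return 0;
        }
    }
    return ENOENT;
}
```

For a directory holding `file-a`, `file-b` and `FILE-B`, a lookup of `FiLe-B` yields `file-b` under this sketch, because it is the first entry that compares equal when case is ignored.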
*	gfapi: link inodes in relevant entry FOPs	Anand Avati	2013-06-08	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Do not let inode linking happen only in lookup(). While that works, it is inefficient. Change-Id: I51bbfb6255ec4324ab17ff00566375f49d120c06 BUG: 953694 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/5162 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	glusterd: Backport of vme table changes from master	Kaushal M	2013-06-05	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch backports the following changes from the master branch 99fe09f glusterd: Moved the volume entry table to a separate file. e306d08 glusterd: Changing the volume entry table's representation. eac54f6 glusterd: Added option description, and validation function fields. bcb4235 glusterd: Added validation function for performance cache max and min size. 8897d08 glusterd: Added validation function for quota-timeout. 4579609 glusterd: Added validation function for stripe-block-size. 6788bad glusterd: Fix some options in vme table 549231d glusterd: Added the validation function for subvols-per-directory 9636e63 glusterd: Added description for nfs.transport-type option in volume set help. Change-Id: I4a64ad94f17df4b45a3a32262a83e2c35fb5f7da BUG: 907311 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/4956 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/distribute: Ignore non-participating subvols for layout checks	shishir gowda	2013-04-11	2	-20/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Backporting fix http://review.gluster.org/#/c/4668/ When subvols-per-directory is < available subvols, then there are layouts which are not populated. This leads to incorrect identification of holes or overlaps. We need to ignore layouts which have err == 0 and start == stop. In the current scenario (start == stop == 0). Additionally, in layout-merge, treat missing xattrs as err = 0. In case of missing layouts, anomalies will reset them. For any other valid subvols, err != 0 in case of layouts being zeroed out. Also reverted back dht_selfheal_dir_xattr, which does layout calculation only on subvols which have errors. BUG: 921408 Change-Id: I75a8edcb92af5b53b3253c9addd7a812e9242836 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4800 Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	dht: improve transform/detransform of d_off (and be ext4 safe)	shishir gowda	2013-04-11	1	-5/+82
	Backporting Avati's fix http://review.gluster.org/4711
	The scheme to encode brick d_off and brick id into global d_off has two approaches. Since brick d_off and global d_off are both 64 bits wide, we need to be careful about how the brick id is encoded.
	Filesystems like XFS always give a d_off which fits within 32 bits. So we have another 32 bits (actually 31, in this scheme, as seen ahead) to encode the brick id - which is typically plenty.
	Filesystems like the recent EXT4 utilize up to 63 low bits in d_off, as the d_off is calculated based on a hash function value. This leaves us no "unused" bits to encode the brick id.
	However both these filesystems (EXT4 more importantly) are "tolerant" in terms of the accuracy of the value presented back in seekdir(). i.e, a seekdir(val) actually seeks to the entry which has the "closest" true offset.
	This "two-prong" scheme exploits this behavior - which seems to be the best middle ground amongst various approaches and has all the advantages of the old approach:
	- Works against XFS and EXT4, the two most common filesystems out there. (which wasn't an "advantage" of the old approach as it is broken against EXT4)
	- Probably works against most of the others as well. The ones which would NOT work are those which return HUGE d_offs _and_ are NOT tolerant to seekdir() to the "closest" true offset.
	- Nothing to "remember in memory" or evict "old entries".
	- Works fine across NFS server reboots and also NFS head failover.
	- Tolerant to seekdir() to arbitrary locations.
	Algorithm:
	Each d_off can be encoded in either of the two schemes. There is no requirement to encode all d_offs of a directory or a reply-set in the same scheme.
	The topmost bit of the 64 bits is used to specify the "type" of encoding of this particular d_off. If the topmost bit (bit-63) is 1, it indicates that the encoding scheme holds a HUGE d_off. If the topmost bit is 0, it indicates that the "small" d_off encoding scheme is used.
	The goal of the "small" d_off encoding is to stay as dense as possible towards the lower bits even in the global d_off.
	The goal of the HUGE d_off encoding is to stay as accurate (close) as possible to the "true" d_off after a round of encoding and decoding.
	If DHT has N subvolumes, we need ROOF(Log2(N)) "bits" to encode the brick ID (call it "n").
	SMALL d_off
	===========
	Encoding
	--------
	If the top n + 1 bits are free in a brick offset, then we leave the top bit as 0 and set the remaining bits based on the old formula:
	    hi_mask = 0xffffffffffffffff
	    hi_mask = ~(hi_mask >> (n + 1))
	    if ((hi_mask & d_off_brick) != 0)
	        do_large_d_off_encoding ()
	    d_off_global = (d_off_brick * N) + brick_id
	Decoding
	--------
	If the top bit in the global offset is 0, it indicates that this is the encoding formula used. So decoding such a global offset will be like the old formula:
	    if ((d_off_global & 0x8000000000000000) != 0)
	        do_large_d_off_decoding()
	    d_off_brick = (d_off_global / N)
	    brick_id = d_off_global % N
	HUGE d_off
	==========
	Encoding
	--------
	If the top n + 1 bits are NOT free in a given brick offset, then we set the top bit as 1 in the global offset. The low n bits are replaced by brick_id.
	    low_mask = 0xffffffffffffffff << n // where n is ROOF(Log2(N))
	    d_off_global = (0x8000000000000000 | d_off_brick & low_mask) + brick_id
	    if (d_off_global == 0xffffffffffffffff)
	        discard_entry();
	Decoding
	--------
	If the top bit in the global offset is set to 1, it indicates that the encoding formula used is the one above. So decoding would look like:
	    hi_mask = (0xffffffffffffffff << n)
	    low_mask = ~(hi_mask)
	    d_off_brick = (global_d_off & hi_mask & 0x7fffffffffffffff)
	    brick_id = global_d_off & low_mask
	If "losing" the low n bits in this decoding of d_off_brick looks "scary", we need to realize that till recently EXT4 used to only return what can now be expressed as (d_off_global >> 32). The extra 31 bits of hash added by EXT recently only decrease the probability of a collision; they do not eliminate it completely anyway. In a way, the "lost" n bits are made up by decreasing the probability of collision by sharding the files into N bricks / EXT directories -- call it "hash hedging", if you will :-)
	Change-Id: I9551c581c3f3d4c9e719764881036d554f60c557
	Thanks-to: Zach Brown <zab@redhat.com>
	BUG: 838784
	Signed-off-by: shishir gowda <sgowda@redhat.com>
	Reviewed-on: http://review.gluster.org/4799
	Reviewed-by: Amar Tumballi <amarts@redhat.com>
	Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
	Tested-by: Gluster Build System <jenkins@build.gluster.com>
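The two-scheme d_off encoding described in this commit message can be sketched in C as below. This is an illustration under stated assumptions, not the code from the patch: the names `brick_id_bits`, `doff_encode` and `doff_decode`, their signatures, and the use of a return code for the "discard entry" case are all hypothetical. The SMALL decoding is written as the arithmetic inverse of the encoding formula `d_off_global = d_off_brick * N + brick_id`, so the quotient recovers the brick offset and the remainder recovers the brick id.

```c
#include <assert.h>
#include <stdint.h>

#define DOFF_HUGE_FLAG UINT64_C(0x8000000000000000)
#define DOFF_ALL_ONES  UINT64_C(0xffffffffffffffff)

/* ROOF(Log2(N)): number of bits needed to hold a brick id. */
unsigned brick_id_bits(uint32_t n_subvols)
{
    unsigned bits = 0;

    assert(n_subvols > 0);
    while ((UINT64_C(1) << bits) < n_subvols)
        bits++;
    return bits;
}

/* Returns 0 on success, -1 if the entry has to be discarded. */
int doff_encode(uint64_t d_off_brick, uint32_t brick_id, uint32_t n_subvols,
                uint64_t *d_off_global)
{
    unsigned n = brick_id_bits(n_subvols);
    uint64_t hi_mask = ~(DOFF_ALL_ONES >> (n + 1));

    assert(brick_id < n_subvols);

    if ((d_off_brick & hi_mask) == 0) {
        /* SMALL: top n + 1 bits are free, keep the value dense. */
        *d_off_global = d_off_brick * n_subvols + brick_id;
        return 0;
    }

    /* HUGE: set the top bit and replace the low n bits with the brick id. */
    uint64_t low_mask = DOFF_ALL_ONES << n;
    uint64_t encoded = (DOFF_HUGE_FLAG | (d_off_brick & low_mask)) + brick_id;

    if (encoded == DOFF_ALL_ONES)
        return -1; /* all-ones value is not usable as an offset */

    *d_off_global = encoded;
    return 0;
}

void doff_decode(uint64_t d_off_global, uint32_t n_subvols,
                 uint64_t *d_off_brick, uint32_t *brick_id)
{
    assert(n_subvols > 0);

    if ((d_off_global & DOFF_HUGE_FLAG) == 0) {
        /* SMALL: invert d_off_brick * N + brick_id. */
        *d_off_brick = d_off_global / n_subvols;
        *brick_id = (uint32_t)(d_off_global % n_subvols);
        return;
    }

    /* HUGE: the low n bits of the brick offset are lost by design. */
    uint64_t hi_mask = DOFF_ALL_ONES << brick_id_bits(n_subvols);

    *d_off_brick = d_off_global & hi_mask & ~DOFF_HUGE_FLAG;
    *brick_id = (uint32_t)(d_off_global & ~hi_mask);
}
```

With N = 4 (so n = 2), a small brick offset such as 1000 on brick 3 round-trips exactly, while a 63-bit offset comes back with its low two bits cleared, which is the loss of accuracy the message argues seekdir() tolerates.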
*	cluster/distribute: Fix layout overlaps due to spread-count in selfheal path	shishir gowda	2013-03-09	1	-50/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We needed to zero out the layout range, before we re-calculate the range. When spread-count is issued, we would end up with stale ranges in the layout. Replaced dht_selfheal_dir_xattr with dht_fix_dir_xattr, which correctly resets the un-used (after re-cal) layouts. Change-Id: I1a900d15df07335f59356bd23182ccec34381ab2 BUG: 884455 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4648 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	cluster/distribute: Reopen fds in migration internally as root:root	shishir gowda	2013-03-04	1	-1/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Though linkfile_create and rebalance dst file create sent a setattr with correct ownership, there is still a race window where the linkfile open (client open due to migration) will fail, as its ownership will be root:root. BUG: 884597 Change-Id: Iba73681eae4f280d39ee6c9a40009e195768bee7 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4612 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	cluster/distribute: Prevent spurious multiple defrag crawls	shishir gowda	2013-03-04	1	-9/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In dht_notify, we used to create a thread to start defrag crawls after we had heard from all child subvols. This was incorrect, as a later event could also trigger the crawl again (because all subvols had already responded). The fix is to make sure the thread is started only once, after all subvols have responded the first time. BUG: 916449 Change-Id: I1619344fbb1cb51d5e1db38d8a29821fa870fa8b Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4610 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	cluster/distribute: Preserve file size during rebalance migration	shishir gowda	2013-03-04	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If holes are encountered, then we do not write these to the dst, which sometimes causes the file size to be less than src. Data is not corrupted, as when non-zero reads are received, we do write that data. Call truncate to set the file size, so that dst does not end up shorter than src when the file ends in a hole. Thanks to Brian Foster for providing the test case. BUG: 915554 Change-Id: I7e1e0c475118b073c3ebb87e93220c1ec22e8b7d Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4609 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
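The effect described in this commit message can be reproduced with a small POSIX sketch: a copy loop that skips zero-filled chunks leaves the destination short when the source ends in a hole, and a final truncate call restores the size. This is not the rebalance code; the function `copy_preserving_size`, the helper `is_all_zero`, and the 4096-byte chunk size are assumptions made for the example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

enum { COPY_CHUNK = 4096 };

/* True when every byte in buf[0..len) is zero. */
static bool is_all_zero(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0)
            return false;
    }
    return true;
}

/*
 * Hypothetical helper: copy src_fd to dst_fd, skipping zero-filled
 * chunks so they stay holes on the destination. Returns 0 on success
 * and -1 on error.
 */
int copy_preserving_size(int src_fd, int dst_fd)
{
    char buf[COPY_CHUNK];
    struct stat st;
    off_t offset = 0;
    ssize_t nread;

    assert(src_fd >= 0 && dst_fd >= 0);

    if (fstat(src_fd, &st) != 0) {
        perror("fstat");
        return -1;
    }

    while ((nread = pread(src_fd, buf, sizeof(buf), offset)) > 0) {
        if (!is_all_zero(buf, (size_t)nread) &&
            pwrite(dst_fd, buf, (size_t)nread, offset) != nread) {
            perror("pwrite");
            return -1;
        }
        offset += nread;
    }
    if (nread < 0) {
        perror("pread");
        return -1;
    }

    /* Without this call, a source ending in a hole yields a shorter dst. */
    return ftruncate(dst_fd, st.st_size);
}
```

In this sketch, a source file with 4 bytes of data extended to 10000 bytes by a trailing hole copies to a destination of exactly 10000 bytes; dropping the final `ftruncate` call would leave the destination at 4096 bytes, the end of the last chunk that was actually written.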
*	cluster/distribute: Remove spurious fd_unref call	shishir gowda	2013-03-04	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \|	After fix http://review.gluster.org/4282 (libglusterfs/syncop: do not hold ref on the fd in cbk) was pushed, syncop_open does not take a ref anymore. BUG: 910661 Change-Id: Idedff91270966e6e70e71ee83785c0228e238d31 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4608 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	cluster/dht: Create linkfile with file uid/gid	shishir gowda	2013-03-04	4	-4/+101
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, linkfile creation happens as root. use uid/gid returned from _cbk (link/rename) to set the correct ownership of the link files. Also added test/dht.rc to implement common dht functions BUG: 884597 Change-Id: I6bc0e04f62d4716fc033681e5678e852a1be7a2f Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4607 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	cluster/dht: pathinfo xattr changes for directories	Venky Shankar	2013-02-08	2	-92/+224
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since directories have presence on all subvolumes there is no definite meaning of ->hashed_subvol or ->cached_subvol. getxattr() code path chooses ->cached_subvol for pathinfo extended attribute. While this makes sense for files, it makes less sense for directories. Further, if a hashed or a cached subvolume is down, and there's a getxattr request for a directory, we return with an errno. This patch changes pathinfo extended attribute contents by aggregating information from all subvolumes that are up. Change-Id: I58adb741d63ccfd1d0239af75eb65f26f0fb384d Signed-off-by: Venky Shankar <vshankar@redhat.com> BUG: 856455 Reviewed-on: http://review.gluster.org/4047 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	Use proper libtool option -avoid-version instead of bogus -avoidversion	Anand Avati	2013-02-07	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	Change-Id: I1c9541058c7d07786539a3266ca125a6a15287d8 BUG: 859835 Signed-off-by: Anand Avati <avati@redhat.com> Original-author: Kacper Kowalik (Xarthisius) <xarthisius.kk@gmail.com> Signed-off-by: Kacper Kowalik (Xarthisius) <xarthisius.kk@gmail.com> Reviewed-on: http://review.gluster.org/3967 Tested-by: Gluster Build System <jenkins@build.gluster.com>