glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	features/marker: Filter internal xattrs in lookup	Pranith Kumar K	2015-02-12	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Backport of http://review.gluster.com/9061 Afr should ignore quota-size-key as part of self-heal but should heal quota-limit key. BUG: 1162230 Change-Id: I639cfabbc44468da29914096afc7e2eca1ff1292 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9091 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
*	cluster/afr: serialize inode locks	Pranith Kumar K	2015-02-11	1	-76/+220
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Backport of http://review.gluster.com/9372 Problem: Afr winds inodelk calls without any order, so blocking inodelks from two different mounts can lead to deadlock when mount1 gets the lock on brick-1 and is blocked on brick-2 whereas mount2 gets the lock on brick-2 and is blocked on brick-1. Fix: Serialize the inodelks whether they are blocking inodelks or non-blocking inodelks. Non-blocking locks also need to be serialized. Otherwise there is a chance that both the mounts which issued the same non-blocking inodelk may end up not acquiring the lock on any brick. Ex: Mount1 and Mount2 request a full-length lock on file f1. Mount1's afr may acquire the partial lock on brick-1 and may not acquire the lock on brick-2 because Mount2 already got the lock on brick-2, and vice versa. Since both the mounts only got partial locks, afr treats them as failure in gaining the locks and unwinds with EAGAIN errno. Change-Id: I939a1d101e313a9f0abf212b94cdce1392611a5e BUG: 1177928 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9374 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
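The non-blocking case in the message above can be illustrated with a toy model. This is only a sketch, not AFR code: bricks are modelled as a dict mapping a brick name to its lock owner, and the helper names (`acquire_in_order`, `unordered_race`, `serialized_race`) are made up for the illustration.

```python
def acquire_in_order(bricks, owner, order):
    """Try non-blocking locks on bricks in the given order; stop at the first failure.

    Returns the list of bricks this owner managed to lock.
    """
    held = []
    for name in order:
        if bricks[name] is None:
            bricks[name] = owner
            held.append(name)
        else:
            break  # another owner holds it: the equivalent of EAGAIN
    return held


def unordered_race():
    # Locks are wound without order, so each mount reaches a different brick first.
    bricks = {"brick-1": None, "brick-2": None}
    m1 = acquire_in_order(bricks, "mount1", ["brick-1"])
    m2 = acquire_in_order(bricks, "mount2", ["brick-2"])
    m1 += acquire_in_order(bricks, "mount1", ["brick-2"])
    m2 += acquire_in_order(bricks, "mount2", ["brick-1"])
    return m1, m2


def serialized_race():
    # Both mounts always start at brick-1, so whoever wins brick-1 also finds brick-2 free.
    bricks = {"brick-1": None, "brick-2": None}
    order = ["brick-1", "brick-2"]
    m1 = acquire_in_order(bricks, "mount1", order)
    m2 = acquire_in_order(bricks, "mount2", order)
    return m1, m2


print(unordered_race())   # both mounts end up with a partial lock
print(serialized_race())  # one mount gets the full lock, the other gets nothing
```

In the unordered run neither mount holds the lock on every brick, which is the situation the commit describes as ending in EAGAIN for both. With a fixed order, the loser fails on the first brick and never holds anything the winner needs.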
*	cluster/afr: When parent and entry read subvols are different, set ↵	Krutika Dhananjay	2015-02-05	1	-1/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	entry->inode to NULL Backport of: http://review.gluster.org/#/c/9477 That way a lookup would be forced on the entry, and its attributes will always be selected from its read subvol. Additionally, directory write fops as well as LOOKUP have been made to unwind parent attributes from parent's read child in AFR. Change-Id: I9fca49fa91cc3a65f53db855fedb90b08f1ca7f4 BUG: 1186121 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/9504 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	cluster/afr : Prevent excessive logging of split-brain messages.	Anuradha	2014-11-13	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Running the volume heal info command would result in excessive logging of split-brain messages. After this patch, running heal info command will not log the split brain messages. This info is now displayed in the output of heal info command instead. If a file is in split-brain, a message "Is in split-brain" will be written against its name. Change-Id: Ib8979be04f5ac7c59ce3ad1185886bb54b8be808 BUG: 1161102 Signed-off-by: Anuradha <atalur@redhat.com> Reviewed-on: http://review.gluster.org/9069 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
*	cluster/afr: Fix xattr heal comparison checks	Pranith Kumar K	2014-11-13	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Backport of part of the fixes in http://review.gluster.org/8558 Problem: While implementing list-xattr based meta-data self-heal for afr-v2 we found 2 issues with afr-v1's implementation. 1) A change in the QUOTA_SIZE_KEY xattr value can trigger spurious metadata self-heal. 2) The xattr comparison function that is implemented for afr-v1 checks if the number of xattrs in both the xattr dicts is the same and then checks that the xattrs present in brick-1's response are present and equal. But what we observed was that the count also contains the gluster internal/virtual xattrs, whereas the compare function should only compare on-disk external xattrs that can be healed. So the correct implementation should check that the external xattrs in the first brick's response are present in the second brick's response and vice versa. Fix: This patch is partly backported from afr-v2's implementation. Will be providing the links where necessary. 1) Added QUOTA_SIZE_KEY xattr to the list of xattrs that need to be ignored. (http://review.gluster.org/#/c/8558/10/xlators/cluster/afr/src/afr-common.c line: 1155) 2) For xattrs to be equal, check all keys in xattr-dict1 are in xattr-dict2 and equal and vice versa. (http://review.gluster.org/#/c/8558/10/xlators/cluster/afr/src/afr-common.c line: 1195) Change-Id: I63aa74858c6f608b98d1fe425b3fa56f925bb5b3 BUG: 1162230 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9090 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
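The two-way comparison described in point 2 can be sketched as follows. This is an illustration in Python, not the C code from the patch; the key `internal.quota-size` is a stand-in for whatever keys the real ignore list contains, and the function names are hypothetical.

```python
# Stand-in for the set of internal/virtual keys that must not take part in the comparison.
IGNORED_KEYS = {"internal.quota-size"}


def is_subset(a, b, ignored=IGNORED_KEYS):
    """True if every non-ignored key of a is present in b with an equal value."""
    return all(k in b and b[k] == v for k, v in a.items() if k not in ignored)


def xattrs_equal(a, b, ignored=IGNORED_KEYS):
    """Compare in both directions, so an extra key on either side is noticed."""
    return is_subset(a, b, ignored) and is_subset(b, a, ignored)


brick1 = {"user.a": b"1", "internal.quota-size": b"10"}
brick2 = {"user.a": b"1", "internal.quota-size": b"20"}
brick3 = {"user.a": b"1", "user.b": b"2"}

print(xattrs_equal(brick1, brick2))  # differing ignored key does not matter
print(is_subset(brick1, brick3))     # a one-way check misses brick3's extra key
print(xattrs_equal(brick1, brick3))  # the two-way check catches it
```

Comparing key counts is deliberately avoided here, since the counts can differ only because of ignored keys.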
*	afr : Logging improvement	Anuradha	2014-11-13	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In case of a split brain, adding the type of split brain that might have occurred. Added a few details to entry-self-heal in self-heal completion status. Change-Id: Ie99e2ecdd8aa5b1c57d7d4515d33a17dfa0c67ad BUG: 1101138 Signed-off-by: Anuradha <atalur@redhat.com> Reviewed-on: http://review.gluster.org/7870 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Niels de Vos <ndevos@redhat.com>
*	cluster/afr: Fix sizeof typo	Pranith Kumar K	2014-10-18	1	-1/+1
\| \| \| \| \| \| \| \| \|	Change-Id: Ib82a1c4967f0880c91c114e4baae08bdbe77bb60 BUG: 1153626 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8935 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
*	Only cleanup priv->shd.statistics if created	Tiziano Müller	2014-10-07	1	-5/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	It is possible that the statistics array was never created and dereferencing it may cause a segfault. BUG: 1147156 Change-Id: If905457ba985add62c3ed543bced1313640af762 Signed-off-by: Tiziano Müller <tiziano.mueller@stepping-stone.ch> Reviewed-on: http://review.gluster.org/8873 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
*	cluster/afr: Launch self-heal only when all the brick status is known	Pranith Kumar K	2014-10-01	1	-2/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: File goes into split-brain because of wrong erasing of xattrs. RCA: The issue happens because index self-heal is triggered even before all the bricks are up. So what ends up happening while erasing the xattrs is, xattrs are erased only on the sink brick for the brick that it thinks is up leading to split-brain Example: lets say the xattrs before heal started are: brick 2: trusted.afr.vol1-client-2=0x000000020000000000000000 trusted.afr.vol1-client-3=0x000000020000000000000000 brick 3: trusted.afr.vol1-client-2=0x000010040000000000000000 trusted.afr.vol1-client-3=0x000000000000000000000000 if only brick-2 came up at the time of triggering the self-heal only 'trusted.afr.vol1-client-2' is erased leading to the following xattrs: brick 2: trusted.afr.vol1-client-2=0x000000000000000000000000 trusted.afr.vol1-client-3=0x000000020000000000000000 brick 3: trusted.afr.vol1-client-2=0x000010040000000000000000 trusted.afr.vol1-client-3=0x000000000000000000000000 So the file goes into split-brain. Change-Id: I79f9a289d2118a715d262398221037b684a53d2a BUG: 1142614 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8757 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
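The xattr values in the example above are easier to follow once decoded. The sketch below assumes the usual layout of an AFR changelog xattr, three big-endian 32-bit pending counters for data, metadata and entry; the helper names are made up for the illustration.

```python
import struct


def decode_changelog(hex_value):
    """Split a 12-byte changelog xattr into its data/metadata/entry counters."""
    raw = bytes.fromhex(hex_value[2:] if hex_value.startswith("0x") else hex_value)
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def blames(xattrs, key):
    """True if any pending counter stored under key is non-zero."""
    return any(decode_changelog(xattrs[key]).values())


ZERO = "0x" + "00000000" * 3

# State after the partial erase described in the commit message.
brick2 = {
    "trusted.afr.vol1-client-2": ZERO,
    "trusted.afr.vol1-client-3": "0x00000002" + "00000000" * 2,
}
brick3 = {
    "trusted.afr.vol1-client-2": "0x00001004" + "00000000" * 2,
    "trusted.afr.vol1-client-3": ZERO,
}

print(decode_changelog(brick3["trusted.afr.vol1-client-2"]))
# Each brick now records pending operations against the other one.
print(blames(brick2, "trusted.afr.vol1-client-3"), blames(brick3, "trusted.afr.vol1-client-2"))
```

Before the erase, brick 2 also held a non-zero counter against itself, so it could be treated as the sink. Once only that counter is cleared, the two bricks accuse each other, which is the split-brain state the commit describes.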
*	cluster/afr: Handle EAGAIN properly in inodelk	Pranith Kumar K	2014-09-29	1	-14/+150
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When one of the bricks is taken down and brought back up in a replica pair, locks on that brick will be allowed. Afr returns inodelk success even when one of the bricks already has the lock taken. Fix: If any brick returns EAGAIN, return failure to the parent xlator. Note: This change only works for non-blocking inodelks. This patch addresses dht-synchronization which uses non-blocking locks for rename. Blocking lock is issued by only one of the rebalance processes. So for now there is no possibility of deadlock. Change-Id: I07673f8873263da334e03f35c6cdb5db9410a616 BUG: 1141733 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8739 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
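The rule in the fix can be written down as a small decision function. This is a sketch only: the function name is hypothetical, and the handling of errors other than EAGAIN (success if any brick granted the lock, otherwise the first error) is an assumption made for the example, not something the commit message states.

```python
import errno


def nonblocking_inodelk_result(brick_errnos):
    """brick_errnos: one entry per brick that replied, 0 meaning the lock was granted.

    Returns 0 on success or the errno to unwind with.
    """
    if errno.EAGAIN in brick_errnos:
        return errno.EAGAIN  # some brick already has the lock taken: fail the whole call
    if 0 in brick_errnos:
        return 0             # assumption: other per-brick errors do not fail the lock
    return brick_errnos[0] if brick_errnos else errno.ENOTCONN


print(nonblocking_inodelk_result([0, errno.EAGAIN]) == errno.EAGAIN)
print(nonblocking_inodelk_result([0, 0]) == 0)
```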
*	cluster/afr: Fix leaks in self-heal code path	Pranith Kumar K	2014-07-18	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: I5301ec9ebac27afe52e85cad75e6395d7f891355 BUG: 1120151 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8316 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
*	cluster/afr: Fix resolution issues with afr	Pranith Kumar K	2014-06-24	1	-10/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem with afr: Let's say there is a directory hierarchy a/b/c/d on the mount and the user is cd'ed into the directory. Bring down one of the bricks of the replica and remove all directories/files to simulate disk replacement on that brick. Now this brick is brought back up. Creates in the cd'ed directory fail with ESTALE. Basically before sending a create of 'f' inside 'd', fuse sends a lookup to make sure the file is not present. On one of the bricks 'd' is present and 'f' is not, so it sends ENOENT as response. On the new brick 'd' itself is not present, so it sends ESTALE. In afr ESTALE is considered to be a special errno, on witnessing which lookup has to fail, and ESTALE is given more priority than ENOENT. Due to these reasons lookup fails with ESTALE rather than ENOENT. Since lookup didn't fail with ENOENT, 'create' can't be issued, so the command fails with ESTALE. Solution: Afr needs to consider the ESTALE errno normally and ENOENT needs to be given more priority so that operations like create can proceed even when only one of the bricks is up and running. Whenever client xlator identifies that gfid-changed, it sets that information in lookup xdata. Afr uses this information to fail the lookup with ESTALE so that the top xlator can send a fresh lookup. Change-Id: Ie8e0e327542fd644409eb5dadf451679afa1c0e5 BUG: 1112348 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8154 Tested-by: Justin Clift <justin@gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
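The priority change can be pictured as choosing one errno out of the per-brick lookup replies. The following is a simplified sketch, not the AFR implementation: the function name and the priority list are illustrative, and treating the lookup as successful whenever any brick succeeds is an assumption of the model.

```python
import errno

# Errnos earlier in the list win over later ones; after the fix ENOENT outranks ESTALE.
PRIORITY = [errno.ENOENT, errno.ESTALE]


def lookup_errno(brick_errnos):
    """brick_errnos: per-brick lookup result, 0 for success."""
    if 0 in brick_errnos:
        return 0
    for candidate in PRIORITY:
        if candidate in brick_errnos:
            return candidate
    return brick_errnos[0]


# Old brick: 'd' exists but 'f' does not. Replaced brick: 'd' itself is missing.
replies = [errno.ENOENT, errno.ESTALE]
print(lookup_errno(replies) == errno.ENOENT)  # create can now proceed
```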
*	cluster/afr: Remove eager-lock stub on finodelk failure	Pranith Kumar K	2014-05-14	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: For write fops afr's transaction eager-lock init adds transactions that can share eager-lock to fdctx list. But if eager-lock finodelk fop fails the stub remains in the list. This could later lead to corruption of the list and lead to infinite loop on the list leading to a mount hang. Fix: Remove the stub when finodelk fails. Change-Id: Ic9d1368907c32edb4ea2e6db623e869e4f50180d BUG: 1063190 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/7748 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Niels de Vos <ndevos@redhat.com>
*	cluster/afr: Fix bugs in quorum implementation	Pranith Kumar K	2014-05-10	1	-37/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Have common place to perform quorum fop wind check - Check if fop succeeded in a way that matches quorum to avoid marking changelog in split-brain. Change-Id: I663072ece0e1de6e1ee9fccb03e1b6c968793bc5 BUG: 1066996 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/7513 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
*	cluster/afr: perform list-xattr during lookup	Ravishankar N	2014-04-28	1	-0/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Detect and heal mismatching user extended attributes during lookup. Depends on: http://review.gluster.org/#/c/7434/ Change-Id: I49410aafd319ac159fdf9e6f9201871bbf2f67bd BUG: 1078061 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/7444 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/afr: Prevent heal info hang when data-self-heal in progress.	Pranith Kumar K	2014-04-28	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: For determining whether data-self-heal is needed afr takes blocking locks. So if self-heal is indeed in progress on the file, this leads to hangs. heal info hung for almost 50 minutes when a 50G file was undergoing heal. Fix: When self-heal is in progress there is a live self-heal-domain lock. In this stage if a non-blocking inodelk for the self-heal-domain lock is performed it will fail with EAGAIN. For heal info we can use this logic to determine that the file is possibly undergoing heal and inform the user instead of waiting for the completion of self-heal. Change-Id: I18527c59e429602bae49c98ff45502833ab8e1f0 BUG: 1039544 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/7482 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
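The same idea, probing a lock without waiting for it, can be shown with an ordinary in-process lock. This is an analogy in Python, not the inodelk call itself; `threading.Lock` stands in for the self-heal-domain lock and the status strings are made up.

```python
import threading


def heal_status(self_heal_domain_lock):
    """Report heal state without blocking on a lock held by a running self-heal."""
    if not self_heal_domain_lock.acquire(blocking=False):
        # The non-blocking attempt failed, the analogue of EAGAIN.
        return "possibly undergoing heal"
    try:
        return "no heal in progress"
    finally:
        self_heal_domain_lock.release()


lock = threading.Lock()
lock.acquire()               # a self-heal is running and holds the lock
print(heal_status(lock))     # returns immediately instead of waiting
lock.release()               # self-heal finished
print(heal_status(lock))
```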
*	heal: Enable logging for glfsheal.	Pranith Kumar K	2014-04-28	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	logs will be written to <log-dir>/glfsheal-<volname>.log Moved some non-essential frequent logs to DEBUG. BUG: 1039544 Change-Id: I2aceda6e3092f8c5052e7a4b8b5dec3cdeebd9a9 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/7481 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/afr: trigger self-heals even when they are set to off.	Pranith Kumar K	2014-04-28	1	-11/+82
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When attempt-self-heal is set to true, trigger data/metadata/entry self-heals even when they are disabled. This is useful for gluster volume heal info to report them even when metadata-self-heal entry-self-heal, data-self-heal are set to off. Change-Id: Idc3f0d5d049c875b4f975248fef56ea2238da47c BUG: 1039544 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/7480 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/afr: get virtual-xattrs only on valid xdata.	Pranith Kumar K	2014-04-28	1	-9/+11
\| \| \| \| \| \| \| \| \| \|	Change-Id: I61fd891435faced25b2bdf617ec07a8af8ef057d BUG: 1046853 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/6605 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/afr: Add dry-run functionality to self-heal.	Pranith Kumar K	2014-04-28	1	-4/+17
\| \| \| \| \| \| \| \| \| \| \| \| \|	This will be useful in figuring out with certainty whether a file needs self-heal or not for data-self-heal. Change-Id: Idf98a68e69f2c35646ef2e7c97302586fe1dc07d BUG: 1039544 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/6510 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/afr: Add foreground self-heal launch capability through lookup.	Pranith Kumar K	2014-04-28	1	-17/+21
\| \| \| \| \| \| \| \| \| \| \| \|	Also renamed allow-sh-for-running-transaction -> attempt-self-heal. Change-Id: I134cc79e663b532e625ffc342c59e49e71644ab3 BUG: 1039544 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/6509 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/afr: Treat ESTALE on nameless lookup as ENOENT	Pranith Kumar K	2014-01-27	1	-1/+3
\| \| \| \| \| \| \| \| \|	Change-Id: I635fc0fa955b33590f1c5b4dfec22d591ea8575c BUG: 1032894 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/6593 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/afr: Remove 'max' from the log	Pranith Kumar K	2013-10-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This patch avoids giving more info to the user about the internal heuristic employed in afr, for quota sizes. Change-Id: Ice3a164399f09b6967500ec0c17dc340e7ae9aba BUG: 1016683 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/6098 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/afr : Implementation of command "gluster volume heal vn statistics"	Venkatesh Somyajulu	2013-10-14	1	-3/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	"gluster volume heal volumename statistics" command gives the summary of the afr crawl done based on the entries present in the xattrop directory. Whenever afr crawls are attempted, the beginning time of crawl, end time of crawl, no of files healed, heal-failed count and number of files in split brain are shown along with the type of the crawl. If crawl is already in progress then it will give the number of files healed, heal failed count and number of files in split-brain from the beginning of the crawl and instead of telling the end time of the crawl, "CRAWL IN PROGRESS" message will be shown. Output format: command: "gluster volume heal volume-name statistics" Output: Gathering afr crawl statistics crawl statistics on volume volume-name has been successful ------------------------------------------------ Crawl statistics for brick no 0 Hostname of brick 192.168.122.248 Starting time of crawl: Wed Jul 10 15:52:38 2013 Ending time of crawl: Wed Jul 10 15:52:38 2013 Type of crawl: INDEX No. of entries healed: 0 No. of entries in split-brain: 0 No. of heal failed entries: 0 Starting time of crawl: Wed Jul 10 15:52:38 2013 Ending time of crawl: Wed Jul 10 15:52:38 2013 Type of crawl: INDEX No. of entries healed: 0 No. of entries in split-brain: 0 No. of heal failed entries: 0 ------------------------------------------------ Crawl statistics for brick no 1 Hostname of brick 192.168.122.1 Starting time of crawl: Wed Jul 10 15:52:42 2013 Ending time of crawl: Wed Jul 10 15:52:42 2013 Type of crawl: INDEX No. of entries healed: 0 No. of entries in split-brain: 0 No. of heal failed entries: 0 Starting time of crawl: Wed Jul 10 15:52:42 2013 Ending time of crawl: Wed Jul 10 15:52:42 2013 Type of crawl: INDEX No. of entries healed: 0 No. of entries in split-brain: 0 No. of heal failed entries: 0 -------------------------------------------------- Change-Id: I10bf9d10b005741db9973fb1352e0dd59ed99aa9 BUG: 949400 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/4790 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	cluster/afr: Handle quota size xattr separately in lookup	Pranith Kumar K	2013-10-10	1	-0/+87
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Quota size xattrs are not maintained by afr. There is a possibility that they differ even when both the directory changelog xattrs suggest everything is fine. So if there is at least one 'source' check among the sources which has the maximum quota size. Otherwise check among all the available ones for maximum quota size. This way if there is a source and stale copies it always votes for the 'source'. Change-Id: Ia222379cbafa7043dd03f533c105860f2c7b8b0d BUG: 1016683 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/6052 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Varun Shastry <vshastry@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
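The selection rule described above (prefer the maximum among sources, fall back to the maximum among all available copies) can be sketched like this. The function name and the dict-based inputs are illustrative and not taken from the patch.

```python
def pick_quota_size(sizes, sources):
    """Choose the quota size to report for a directory.

    sizes:   brick name -> quota size, or None if that brick gave no value
    sources: set of brick names that the changelog xattrs mark as sources
    """
    available = {b: s for b, s in sizes.items() if s is not None}
    from_sources = {b: s for b, s in available.items() if b in sources}
    candidates = from_sources or available  # fall back when there is no usable source
    return max(candidates.values()) if candidates else None


sizes = {"brick-1": 100, "brick-2": 500}
print(pick_quota_size(sizes, {"brick-1"}))  # the source wins even though a stale copy is larger
print(pick_quota_size(sizes, set()))        # no source: maximum over all available copies
```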
*	cluster/afr: Have common inode-write-fop cbk	Pranith Kumar K	2013-09-18	1	-495/+7
\| \| \| \| \| \| \| \| \| \|	Change-Id: Ia7b324b86d6a7051d187106d7a060155e77defc5 BUG: 910217 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5238 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	cluster/afr: Improvement in logging of self heal completion status	Venkatesh Somyajulu	2013-08-29	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \|	Additional information for source and sinks are added. Change-Id: I1704956ff86ac3ae36744efe7499c1d1c43faeaf BUG: 968301 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/5638 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	cluster/afr: Add special handling for failure postops	Pranith Kumar K	2013-08-28	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Idea is to not leave the file in FOOL-FOOL scenario in case on all the bricks data transaction failed with EDQUOT to avoid increasing un-necessary load of self-heals in the system. For directory transactions don't leave pending changelog in case the failures are seen on all the subvolumes. Change-Id: I38a5561d1d581a78347a76a4a509514e4a0c3fb7 BUG: 969461 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5709 Reviewed-by: Anand Avati <avati@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	afr: treat appending writes as stable writes.	Anand Avati	2013-08-13	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	Durability of appending writes is implicit in the file size. Therefore performing an explicit fsync() is unnecessary in such cases as self-heal can check for the size of file when pending changelog is not unambiguous. Change-Id: I05446180a91d20e0dbee5de5a7085b87d57f178a BUG: 927146 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/5501 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	afr: check for non-zero call_count before doing a stack wind	Ravishankar N	2013-08-07	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When one of the bricks of a 1x2 replicate volume is down, writes to the volume cause a race between afr_flush_wrapper() and afr_flush_cbk(). The latter frees up the call_frame's local variables in the unwind, while the former accesses them in the for loop and sends a stack wind the second time. This causes the FUSE mount process (glusterfs) to receive a SIGSEGV when the corresponding unwind is hit. This patch adds the call_count check which was removed when afr_flush_wrapper() was introduced in commit 29619b4e Change-Id: I87d12ef39ea61cc4c8244c7f895b7492b90a7042 BUG: 988182 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/5393 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	cluster/afr: Disable eager-lock if open-fd-count > 1	Pranith Kumar K	2013-08-02	1	-0/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Lets say mount1 has eager-lock(full-lock) and after the eager-lock is taken mount2 opened the same file, it won't be able to perform any data operations until mount1 releases eager-lock. To avoid such scenario do not enable eager-lock for transaction if open-fd-count is > 1. Delaying of changelog piggybacking is avoided in this situation. Change-Id: I51b45d6a7c216a78860aff0265a0b8dabc6423a5 BUG: 910217 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5432 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: venkatesh somyajulu <vsomyaju@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/afr: Print self-heal log when self-heal succeeds	Pranith Kumar K	2013-07-31	1	-0/+3
\| \| \| \| \| \| \| \| \|	Change-Id: I95e47e589419dc6a032cbd8ba01964b6c176c2d5 BUG: 927146 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5408 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/afr: Handle REPLICATE_TRASH_DIR from old bricks	Pranith Kumar K	2013-07-26	1	-0/+7
\| \| \| \| \| \| \| \| \|	Change-Id: Ib99f79d3fa607c818dbc62006516480f598d8add BUG: 886998 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4640 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	afr : change the log level in lookup path to minimize incessant logging.	Ravishankar N	2013-07-08	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	Change the logging levels from WARNING to DEBUG in the lookup path to minimize incessant logging in case of gfid mismatch errors. Change-Id: I631b16df3249cf826606f547531f985dac696088 BUG: 959083 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/4939 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	cluster/afr: Refactor inodelk to handle multiple domains	Pranith Kumar K	2013-07-03	1	-7/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- afr_local_copy should not be memduping locked nodes, that would mean that lock is taken in self-heal on those nodes even before it actually takes the lock. So removed memdup code. Even entry lock related copying (lockee info) is also not necessary for self-heal functionality, so removing that as well. Since it is not local_copy anymore changed its name. - My editor changed tabs to spaces. Change-Id: I8dfb92cb8338e9a967c06907a8e29a8404782d61 BUG: 967717 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5099 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/afr: post-op should complete before starting flush	Pranith Kumar K	2013-07-03	1	-18/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: At the moment afr-flush makes sure that a delayed post-op is woken up but it does not wait for it to complete the post-op before flush unwinds. These are the steps that are happening: 1) flush fop comes on an fd which wakes up a delayed post-op and continues with the flush fop. 2) post-op sends fsync on the wire. 3) flush completes and unwinds to fuse. 4) graph switch happens on the fuse mount disconnecting the old graph's client connections to bricks. 5) xattrop after fsync fails with ENOTCONN because the connections from old graph are taken down now. Fix: Wait for post-op to complete before starting to flush. We could make flush act similar to fsync (i.e.) wind flush as is but wait for post-op to complete before unwinding flush, but it is better to send flush as the final fop. So wind of flush will start after post-op is complete. Had to change fsync to accommodate this change. Change-Id: I93aa642647751969511718b0e137afbd067b388a BUG: 980548 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5274 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	nfs: Remove afr split-brain handling in nfs	Pranith Kumar K	2013-06-25	1	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We added this code as an interim fix until afr can handle split-brains even when opens are not issued. Afr code has matured to reject fd based fops when there are split-brains so we can remove it. Change-Id: Ib337f78eccee86469a5eaabed1a547a2cea2bdcf BUG: 974972 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5227 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	glusterfs: discard (hole punch) support	Brian Foster	2013-06-13	1	-0/+241
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for the DISCARD file operation. Discard punches a hole in a file in the provided range. Block de-allocation is implemented via fallocate() (as requested via fuse and passed on to the brick fs) but a separate fop is created within gluster to emphasize the fact that discard changes file data (the discarded region is replaced with zeroes) and must invalidate caches where appropriate. BUG: 963678 Change-Id: I34633a0bfff2187afeab4292a15f3cc9adf261af Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/5090 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	gluster: add fallocate fop support	Brian Foster	2013-06-13	1	-0/+244
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement support for the fallocate file operation. fallocate allocates blocks for a particular inode such that future writes to the associated region of the file are guaranteed not to fail with ENOSPC. This patch adds fallocate support to the following areas: - libglusterfs - mount/fuse - io-stats - performance/md-cache,open-behind - quota - cluster/afr,dht,stripe - rpc/xdr - protocol/client,server - io-threads - marker - storage/posix - libgfapi BUG: 949242 Change-Id: Ice8e61351f9d6115c5df68768bc844abbf0ce8bd Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/4969 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	afr: let eager-locking do its own overlap checks	Anand Avati	2013-04-05	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Today there is a non-obvious dependence of eager-locking on write-behind. The reason is that eager-locking works as long as the inheriting transaction has no overlaps with any of the transactions already in progress. While write-behind provides non-overlapping writes as a side-effect most of times (and only guarantees it when strict-write-ordering option is enabled, which is not on by default) eager-lock needs the behavior as a guarantee. This is leading to complex and unwanted checks for the presence of write-behind in the graph, for the simple task of checking for overlaps. This patch removes the interdependence between eager-locking and write-behind by making eager-locking do its own overlap checks with in-progress writes. Change-Id: Iccba1185aeb5f1e7f060089c895a62840787133f BUG: 912581 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4782 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
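An overlap check of the kind this message refers to can be sketched on (offset, length) pairs. This is a simplified illustration with hypothetical function names; it assumes every write has a positive length and does not model any "length 0 means up to end of file" convention.

```python
def overlaps(a_off, a_len, b_off, b_len):
    """True if byte ranges [a_off, a_off + a_len) and [b_off, b_off + b_len) intersect."""
    return a_off < b_off + b_len and b_off < a_off + a_len


def can_inherit_eager_lock(new_write, in_progress):
    """A new write may share the eager lock only if it overlaps no in-progress write."""
    return not any(overlaps(*new_write, *w) for w in in_progress)


in_progress = [(0, 4096), (8192, 4096)]
print(can_inherit_eager_lock((4096, 4096), in_progress))  # fits in the gap between the two
print(can_inherit_eager_lock((4000, 200), in_progress))   # straddles the end of the first write
```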
*	cluster/afr: detect in-progress creation in lookup and return ENOENT	Pranith Kumar K	2013-04-02	1	-0/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If any subvol returned ENOENT while the parent entrylk lock was held, yield and return ENOENT for the entire lookup. This is how the issue happens: Multiple clients A, B and C are attempting 'mkdir -p /mnt/a/b/c'. 1. Client A is in the middle of mkdir(/a). It has acquired the lock. It has performed mkdir(/a) on one subvol, and the second one is still in progress. 2. Client B performs a lookup, sees directory /a on one, ENOENT on the other, succeeds lookup. 3. Client B performs lookup on /a/b on both subvols, both return ENOENT (one subvol because /a/b does not exist, another because /a itself does not exist). 4. Client B proceeds to mkdir /a/b. It obtains entrylk on inode=/a with basename=b on one subvol, but fails on other subvol as /a is yet to be created by Client A. 5. Client A finishes mkdir of /a on other subvol. 6. Client C also attempts to create /a/b, lookup returns ENOENT on both subvols. 7. Client C tries to obtain entrylk on inode=/a with basename=b, obtains on one subvol (where B had failed), and waits for B to unlock on other subvol. 8. Client B finishes mkdir() on one subvol with GFID-1 and completes transaction and unlocks. 9. Client C gets the lock on the second subvol. At this stage second subvol already has /a/b created from Client B, but Client C does not check that in the middle of mkdir transaction. 10. Client C attempts mkdir /a/b on both subvols. It succeeds on ONLY ONE (where Client B could not get lock because of missing parent /a dir) with GFID-2, and gets EEXIST from ONE subvol. This way we have /a/b in GFID mismatch. One subvol got GFID-1 because Client B performed transaction on only one subvol (because entrylk() could not be obtained on second subvol because of missing parent dir -- caused by premature/speculative succeeding of lookup() on /a when locks are detected). 
Other subvol gets GFID-2 from Client C because while it was waiting for entrylk() on both subvols, Client B was in the middle of creating mkdir() on only one subvol, and Client C does not "expect" this when it is between lock() and pre-op()/op() phase of the transaction. Original-author: Anand Avati <avati@redhat.com> Change-Id: Idca475dbbc2a51e09da6fa0f9e1e37148caef208 BUG: 860210 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4625 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
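The rule stated at the top of this commit message (any ENOENT seen while the parent entrylk is held fails the whole lookup) can be sketched as a small decision function. This is an illustration only: struct subvol_reply, its fields, and lookup_verdict() are hypothetical, and how the real patch learns that a parent entrylk is held is not shown here.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one subvolume's lookup reply. */
struct subvol_reply {
    int  op_ret;              /* 0 on success, -1 on failure */
    int  op_errno;            /* meaningful only when op_ret is -1 */
    bool parent_entrylk_held; /* an entrylk was held on the parent dir */
};

/* Returns 0 if the lookup may succeed, otherwise an errno value.
 * An ENOENT from any subvolume while the parent entrylk is held is
 * treated as an in-progress creation, so the whole lookup yields
 * ENOENT instead of speculatively succeeding. */
int lookup_verdict(const struct subvol_reply *replies, size_t n)
{
    bool any_success = false;
    int last_errno = ENOENT;

    for (size_t i = 0; i < n; i++) {
        if (replies[i].op_ret == 0) {
            any_success = true;
            continue;
        }
        last_errno = replies[i].op_errno;
        if (replies[i].op_errno == ENOENT && replies[i].parent_entrylk_held)
            return ENOENT;
    }
    return any_success ? 0 : last_errno;
}
```

In step 2 of the scenario, Client B would receive ENOENT for /a under this rule, because one subvolume reports ENOENT while Client A still holds the entrylk on the parent.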
*	cluster/afr: piggyback and fsync resume changes	Pranith Kumar K	2013-03-28	1	-7/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	1) pre_op_piggyback should always be decremented. 2) Move fsync resume to just after post_op. 3) fsync stub should be created from afr's local not from the final response. Change-Id: I220bb532eb03bea584292f4dd2e816ad0c3e0cf7 BUG: 927146 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4741 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	cluster/afr: fsync() guarantees POST-OP completion	Anand Avati	2013-03-27	1	-6/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	AFR now provides a stronger guarantee that fsync() returns only after completely finishing all the deferred/delayed POST-OP on that open file. To achieve this we make a stub out of the returning fsync and register it with the "delayed" frame in afr_changelog_wake_resume(). The delayed frame, after getting woken up and finishing the POST-OP, will call_resume() the registered stub (which UNWINDs the fsync) at the time of frame destruction. This provides a guarantee that an application's (or FUSE) fsync() returns only after finishing up all the previous transactions, including delayed POST-OPs and UNLOCK. Change-Id: Iaa955457e2f25088a144fde37ad0444277b5cf49 BUG: 927146 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4737 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	cluster/afr: ensure DATA operations are made durable before POST-OP	Anand Avati	2013-03-27	1	-1/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The changelogging scheme of AFR stores information about the state of all replicas in all replicas (in the extended attribute of the respective files on each server) in the form of 'pending counts' of operations (effectively "dirty flags"). These xattrs are blindly trusted while performing self-heal, and therefore utmost care has to be taken while updating and maintaining them. The most critical update is the clearing of the pending counts corresponding to the other server in the changelog of a given server. Before clearing the pending count, we need a durability guarantee of the write which was performed on the other server. To obtain such a guarantee, it may be necessary to explicitly introduce an fsync() phase (if the file itself wasn't already opened with O_SYNC). This patch introduces the detection of unstable writes on a file and issues explicit fsync() on the servers before performing the POST-OP clearing of pending flags. Change-Id: I2171b86a74ec91e40e5877eef0a4e7379578ecf7 BUG: 927146 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4721 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
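The ordering this commit describes (make the data durable first, clear the pending count only afterwards) can be modelled on a single local file. This is a deliberately simplified sketch: struct replica and replicated_write() are hypothetical, the in-memory `pending` field stands in for the changelog xattr, and the real AFR keeps the counts for one server on the other servers rather than alongside the data.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical model of one replica of a file. */
struct replica {
    int      fd;
    bool     opened_o_sync; /* writes are already stable if true */
    uint32_t pending;       /* stands in for the pending-count xattr */
};

/* Returns 0 on success. On failure returns -1 and leaves `pending`
 * non-zero, so a later self-heal would still see the replica as dirty. */
int replicated_write(struct replica *r, const void *buf, size_t len)
{
    r->pending++;                              /* PRE-OP: mark dirty */

    ssize_t written = write(r->fd, buf, len);  /* OP */
    if (written < 0 || (size_t)written != len)
        return -1;

    /* Unstable write: force it to disk before trusting it. */
    if (!r->opened_o_sync && fsync(r->fd) != 0)
        return -1;

    r->pending--;                              /* POST-OP: clear flag */
    return 0;
}
```

The point of the sketch is the position of the fsync() call: the pending count is decremented only on the path where the write is known to be durable.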
*	nfs, afr: Fail lookup only on split-brain	Pranith Kumar K	2013-03-20	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: Icee9772f1f1bf5336eb82a4dc13e198424cd4a65 BUG: 921996 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4699 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	cluster/afr: Turn on eager-lock for fd DATA transactions	Pranith Kumar K	2013-03-01	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: With the present implementation, eager-lock is issued for any fd fop. eager-lock is being transferred to metadata transactions. But the lk-owner is set to local->fd address only for DATA transactions, but for METADATA transactions it is frame->root. Because of this unlock on the eager-lock fails and rebalance hangs. Fix: Enable eager-lock for fd DATA transactions Change-Id: If30df7486a0b2f5e4150d3259d1261f81473ce8a BUG: 916226 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4588 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	cluster/afr: Don't queue transactions during open-fd fix	Pranith Kumar K	2013-02-22	1	-19/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before anonymous fds were available, afr had to queue up transactions if the file was not opened on one of its subvolumes. This happened until the attempt to open the file either succeeded or failed. These attempts were repeated until the file was successfully opened on the subvolume. Now the client xlator uses anonymous fds to perform the fops if the fd used for the fop is not 'opened'. Fops will be successful even when the file is not opened, so there is no need to queue up the transactions anymore in afr. Open is attempted on the subvolume where it is not opened, independent of the fop. Change-Id: Id1a4b4ebe6f89f9efe8f6a8247918b91247d0819 BUG: 913051 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4568 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	cluster/afr: do complete split-brain check in all the fd based fops	Raghavendra Bhat	2013-02-19	1	-15/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fd based operations such as readv checked only for data split-brain instead of complete split-brain (i.e. both data + metadata), assuming that open would have done the complete split-brain check. However, open-behind would have unwound open without winding to afr, thus preventing the complete split-brain check, and some applications will be able to read the contents of the file even though the file has metadata split-brain. So let all the fd based fops do a defensive check of complete split-brain. Change-Id: Ia90b35f2b08426dfcad804b7f8105278c86fbd2d BUG: 846240 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/4548 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	cluster/afr: Avoid priv->eager_lock value update race	Pranith Kumar K	2013-02-06	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	Change-Id: I7049c0c64e36a9dfa4cc0e0b34de7ec111d2f6c1 BUG: 908302 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4076 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	cluster/afr: wakeup delayed post op on fsync	Pranith Kumar K	2013-01-29	1	-5/+3
\| \| \| \| \| \| \| \| \|	Change-Id: I5d84ef72615f9d71b4af210976e2449de6e02326 BUG: 888174 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4446 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>