glusterfs.git -

	Commit message	Author	Age	Files	Lines
*	cluster/afr: Make AFR eager-locking similar to EC	Pranith Kumar K	2018-03-14	1	-36/+0
	Problem: 1) AFR's eager-lock only works for data transactions. 2) When there are conflicting writes, the write with the conflicting region initiates unlock of the eager-lock, leading to extra pre-ops and post-ops on the file. When eager-lock goes off, it leads to extra fsyncs for random-write workloads in AFR.
	Solution (modeled after EC): In EC, when there is a conflicting write, it waits for the current write to complete before it winds the conflicting write. This leads to better utilization of network and disk, because we will not be doing extra xattrops, FSYNCs and inodelk/unlock. Moved fd-based counters to inode-based counters. I tried to model the solution on EC's locking, but the result is not identical to EC's because we had to keep backward compatibility.
	Lifecycle of lock: The first transaction is added to the inode->owners list and an inodelk is sent on the wire. All subsequent transactions are put in the inode->waiters list until the first transaction completes inodelk and [f]xattrop. Once [f]xattrop also completes, every request in the inode->waiters list is checked for conflicts with the existing locks in the inode->owners list; those that do not conflict are added to the inode->owners list and resume their transaction. When these transactions complete the fop phase, they are moved to the inode->post_op list and the transactions that were paused because of conflicts are resumed. Post-op and unlock are not issued on the wire until the last transaction on that inode. When the last transaction has to perform post-op, it can choose to sleep for the delayed-post-op-secs value. If any other transaction arrives during that time, it wakes up the sleeping transaction and takes over ownership of the lock, and the cycle continues.
	If delayed-post-op-secs expires, the timer thread wakes up the sleeping transaction, which sets lock->release to true and starts doing post-op and then unlock. Any transactions that arrive during this time are put in the inode->frozen list. Once the previous unlock completes, the frozen list is moved to the waiters list, the first element of the waiters list is moved to the owners list, the lock is attempted, and the cycle continues. This is the general idea. There is logic at the time of delaying, at the time of a new transaction, and in the flush fop to wake up existing sleeping transactions or to choose whether to delay a transaction; this is subject to change with future enhancements.
	Fixes: #418 BUG: 1549606 Change-Id: I88b570bbcf332a27c82d2767dfa82472f60055dc Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
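The owners/waiters part of the lifecycle described in this message can be sketched as a small model. This is only an illustration under simplifying assumptions: the names `Txn`, `InodeLock`, `add` and `complete` are hypothetical, conflicts are modeled as overlapping byte ranges, and the inodelk/[f]xattrop phase, the post_op list, the frozen list and the delayed post-op timer are all left out. It is not AFR's actual code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Txn:
    name: str
    start: int  # first byte offset touched
    end: int    # last byte offset touched (inclusive)


def conflicts(a: Txn, b: Txn) -> bool:
    """Two transactions conflict when their byte ranges overlap."""
    return a.start <= b.end and b.start <= a.end


@dataclass
class InodeLock:
    owners: list = field(default_factory=list)   # transactions currently running
    waiters: list = field(default_factory=list)  # transactions paused on a conflict

    def add(self, txn: Txn) -> None:
        # A new transaction runs right away only if it conflicts with no owner.
        if any(conflicts(txn, o) for o in self.owners):
            self.waiters.append(txn)
        else:
            self.owners.append(txn)

    def complete(self, txn: Txn) -> None:
        # The finished transaction leaves the owners list, then every waiter
        # is re-checked and resumed if it no longer conflicts with an owner.
        self.owners.remove(txn)
        still_waiting = []
        for w in self.waiters:
            if any(conflicts(w, o) for o in self.owners):
                still_waiting.append(w)
            else:
                self.owners.append(w)
        self.waiters = still_waiting
```

With writes `a` (bytes 0-99), `b` (bytes 50-149) and `c` (bytes 200-299) added in that order, `a` and `c` run together while `b` waits; completing `a` resumes `b` without any unlock going out on the wire in between.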
*	cluster/ec: Change default read policy to gfid-hash	Ashish Pandey	2018-03-14	1	-4/+3
	Problem: Whenever we read data from a file over NFS, NFS reads more data than requested and caches it. Based on the stat information, it checks whether the cached/pre-read data is still valid. Consider a 4 + 2 EC volume where all the bricks are on different nodes. In EC, with the round-robin read policy, reads are sent to different sets of data bricks. This balances the read fops across all the bricks and avoids overloading the same set of bricks. Due to small differences in clock speed, it is possible to get minor differences in atime, mtime or ctime on different bricks. That might cause a different stat to be returned to NFS, based on which NFS will discard cached/pre-read data which has not actually changed and could have been used. Solution: Change the read policy for EC to gfid-hash. That forces all reads of a file to go to the same set of bricks. Change-Id: I825441cc519e94bf3dc3aa0bd4cb7c6ae6392c84 BUG: 1554743 Signed-off-by: Ashish Pandey <aspandey@redhat.com>
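The difference between the two read policies can be illustrated with a short sketch. Everything here is an assumption made for illustration: the class names are hypothetical, `zlib.crc32` stands in for whatever hash EC actually uses, and the brick set is modeled as consecutive brick indices starting from a chosen first brick.

```python
import uuid
import zlib


def brick_set(first: int, nodes: int, fragments: int) -> list:
    """Indices of the bricks read from, starting at `first` and wrapping."""
    return [(first + i) % nodes for i in range(fragments)]


class RoundRobinPolicy:
    """Each read starts one brick further along, whatever the file is."""

    def __init__(self, nodes: int, fragments: int):
        self.nodes, self.fragments, self.next_first = nodes, fragments, 0

    def pick(self, gfid: uuid.UUID) -> list:
        first = self.next_first
        self.next_first = (self.next_first + 1) % self.nodes
        return brick_set(first, self.nodes, self.fragments)


class GfidHashPolicy:
    """The starting brick depends only on the file's gfid."""

    def __init__(self, nodes: int, fragments: int):
        self.nodes, self.fragments = nodes, fragments

    def pick(self, gfid: uuid.UUID) -> list:
        first = zlib.crc32(gfid.bytes) % self.nodes  # stand-in hash (assumption)
        return brick_set(first, self.nodes, self.fragments)
```

For a 4 + 2 volume (`nodes=6`, `fragments=4`), two consecutive reads of the same file hit different brick sets under round-robin but the same set under gfid-hash, so the stat times seen by NFS come from the same bricks every time.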
*	tests/basic/namespace: Fix the namespace test failure	Varsha Rao	2018-03-14	1	-5/+7
	In the Jenkins regression tests, brick multiplexing is enabled by the is_brick_mx_enabled function and not by setting the cluster.brick-multiplex option. Hence check the count of bricks and their logs; this fixes the failure. Change-Id: Ibb2ed8fbffd3765f283da741689304a5579d447c BUG: 1555167 Signed-off-by: Varsha Rao <varao@redhat.com>
*	cluster/ec: avoid delays in self-heal	Xavi Hernandez	2018-03-14	1	-0/+41
	Self-heal creates a thread per brick to sweep the index looking for files that need to be healed. These threads are started before the volume comes online, so they do nothing but wait for the next sweep, which happens once per minute. When a replace-brick command is executed, the new graph is loaded and all index sweeper threads are started. When all bricks have reported, a getxattr request is sent to the root directory of the volume. This causes a heal on it (because the new brick doesn't have good data), and marks its contents as pending to be healed. That healing is done by the index sweeper thread on the next round, one minute later. This patch solves this problem by waking all index sweeper threads after a successful check on the root directory. Additionally, the index sweep thread scans the index directory sequentially, but it might happen that after healing a directory entry more index entries are created but skipped by the current directory scan. This causes the remaining entries to be processed on the next round, one minute later. The same can happen in the next round, so the heal runs in bursts and takes a long time to finish, especially on volumes with many directory levels. This patch solves this problem by immediately restarting the index sweep if a directory has been healed. Change-Id: I58d9ab6ef17b30f704dc322e1d3d53b904e5f30e BUG: 1547662 Signed-off-by: Xavi Hernandez <jahernan@redhat.com>
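The effect of restarting the sweep right after a directory heal can be shown with a toy simulation. The function and its arguments are hypothetical, the index is modeled as a set of pending names, and healing a directory is assumed to add its children to the index; the sketch only counts how many one-minute waits are needed.

```python
def count_waits(tree: dict, root: str, restart_on_dir_heal: bool) -> int:
    """Count the periodic waits needed before the index is empty.

    `tree` maps a directory name to the entries whose heal becomes pending
    once that directory has been healed; names missing from it are files.
    """
    pending = {root}
    waits = 0
    while pending:
        snapshot = list(pending)  # entries visible to this scan
        healed_dir = False
        for entry in snapshot:
            pending.discard(entry)
            children = tree.get(entry, [])
            if children:
                # Healing a directory creates new index entries that the
                # current scan does not see.
                pending.update(children)
                healed_dir = True
        if pending and not (restart_on_dir_heal and healed_dir):
            waits += 1  # sleep until the next periodic sweep
    return waits
```

For a chain of three nested directories ending in one file, the periodic-only behaviour needs three waits, one per directory level, while restarting after each directory heal needs none.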
*	tests/bug-1110262.t: fix a race condition	Raghavendra G	2018-03-13	1	-0/+6
	This test does: 1. mount a volume 2. kill a brick in the volume 3. mkdir (/somedir) In my local tests and in [1], I see that mkdir in step 3 fails because there is no dht-layout on the root directory. The reason, I think, is that by the time the first lookup on "/" hit dht, a brick had been killed as per step 2. This means the layout was not healed for "/" and, since this is a new volume, no layout is present on it. Note that the first lookup done on "/" by fuse-bridge is not synchronized with the parent process of the daemonized glusterfs mount completing. In other words, by the time the glusterfs command has executed there is no guarantee that the lookup on "/" is complete. So, if step 2 races ahead of fuse_first_lookup on "/", we end up with an invalid dht-layout on "/", resulting in failures. Doing an operation like ls makes sure that the lookup on "/" is completed before we kill a brick. Change-Id: Ie0c4e442c4c629fad6f7ae850437e3d63fe4bea9 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1543279
*	run-tests.sh: added dependency check for netstat	Sven Fischer	2018-03-12	1	-2/+0
	Because bug-924726.t depends on netstat, tests failed before. This was resolved by adding the respective check to run-tests.sh. Enabled the respective test again. Change-Id: I70c9bff03379ed9ee8cd95842c3501dfb50b8e86 BUG: 1312830 Signed-off-by: Sven Fischer <sven@fischer-abc.de>
*	tests: don't kill the process directly with KILL signal	Amar Tumballi	2018-03-08	2	-4/+86
	Instead, send SIGTERM (the default, 15) first, and send SIGKILL only at the end. If SIGKILL is sent directly, tools like valgrind, lcov etc. are not able to process their information properly, so we miss many of those results. BUG: 1549000 Change-Id: I664de12ee7dbf47eb98b8141004cd51f6006b314 Signed-off-by: Amar Tumballi <amarts@redhat.com>
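The "SIGTERM first, SIGKILL as a last resort" pattern looks roughly like this. It is a generic sketch using Python's `subprocess` module on a POSIX system, not the shell code the tests actually use; the function name and the grace period are assumptions.

```python
import subprocess
import sys


def stop_gracefully(proc: subprocess.Popen, grace_seconds: float = 5.0) -> int:
    """Ask the process to exit with SIGTERM; use SIGKILL only if it does not."""
    proc.terminate()  # sends SIGTERM on POSIX, so exit handlers can still run
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()   # sends SIGKILL, which cannot be caught or ignored
        proc.wait()
    return proc.returncode


if __name__ == "__main__":
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(stop_gracefully(child))
```

A process that exits on SIGTERM gets the chance to flush coverage or memory-check data, which is the point the message makes about valgrind and lcov.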
*	hooks: add a script to stat the subdirs in add-brick	Amar Tumballi	2018-03-06	1	-12/+4
	The subdirectories are expected to be present for a subdir mount to be successful. If not, client_handshake() itself fails. When a volume is about to be mounted for the first time, this is easier to handle: if the directory is not present in one brick, then it is most likely not present in any other brick. In the case of add-brick, the directory is not present in the new brick, and there is no chance of healing it from the subdirectory mount, because on those clients the subdir itself is the 'root' ('/') of the filesystem. Hence we need a volume mount to heal the directory before connections can succeed. This patch takes care of that by healing the directories which are expected to be mounted as subdirectories from the volume-level mount point. Change-Id: I2c2ac7b7567fe209aaa720006d09b68584d0dd14 BUG: 1549915 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	cluster/afr: Remove compound-fops usage in afr	Pranith Kumar K	2018-03-06	1	-37/+0
	We are not seeing much improvement with this change, so the feature is being removed so that it doesn't need to be maintained anymore. Fixes: #414 Change-Id: Ic7969b151544daf2547bd262a9fa03f575626411 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	features/shard: Upon FSYNC from upper layers, wind fsync on all changed shards	Krutika Dhananjay	2018-03-05	2	-1/+59
	Change-Id: Ib74354f57a18569762ad45a51f182822a2537421 BUG: 1468483 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	features/shard: Fix shard inode refcount when it's part of priv->lru_list.	Krutika Dhananjay	2018-03-02	3	-17/+45
	For as long as a shard's inode is in priv->lru_list, it should have a non-zero ref-count. This patch achieves it by taking a ref on the inode when it is added to the lru list. When it's time for the inode to be evicted from the lru list, a corresponding unref is done. Change-Id: I289ffb41e7be5df7489c989bc1bbf53377433c86 BUG: 1468483 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	tests/basic/namespace: Check if brick multiplex is enabled	Varsha Rao	2018-02-27	1	-0/+23
	This patch fixes the namespace test failure when brick multiplexing is enabled by changing the log file name in that case, as only one log file is generated for all bricks. Change-Id: Ide941946e5e1b2676e7139e1b5bf6b93b93c0815 Signed-off-by: Varsha Rao <varao@redhat.com>
*	xlators/features/namespace: Add namespace xlator and link into brick graph	Varsha Rao	2018-02-21	1	-0/+104
	The following release-3.8-fb branch patch is upstreamed: > features/namespace: Add namespace xlator and link into brick graph > Commit ID: dbd30776f26e > https://review.gluster.org/#/c/18041/ > By Michael Goulet <mgoulet@fb.com> Changes in this patch: - Removes extra config.h and namespace.h file in namespace.c - Adds default_getspec_cbk to libglusterfs.sym - Rename dict_for_each to dict_foreach_inline - Remove fd.h header file stack.h - Add test case for truncate, open and symlink This patch is required to forward port io-threads namespace patch. Updates: #401 Change-Id: Ib88c95b89eecee9b8957df8a4c8712c899c761d1 Signed-off-by: Varsha Rao <varao@redhat.com>
*	tests: Set timeout of 300 for self-heal.t	Nigel Babu	2018-02-21	1	-0/+2
	There are a few tests that take more time on regression nodes. Change-Id: If126d5ebd422cd6d99125db040e74f0d104af7bc Signed-off-by: Nigel Babu <nigelb@redhat.com>
*	tests: bring option of per test timeout	Amar Tumballi	2018-02-15	3	-0/+6
	This uses the 'timeout' command with a default of 300 seconds. Right now, there is just 1 test which takes more than that on a properly set up machine. Ideally, the default would be set to something like 30 seconds, and if a test is supposed to take more than that, its owner should knowingly add a timeout line to the test. That way, test writers have to think about a time limit too. Change-Id: I747005ce1f208aeb2ecbf899e8feea487ecd21a0 Signed-off-by: Amar Tumballi <amarts@redhat.com>
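A per-test time limit with a default and an optional override could be sketched as follows. The message says the real change uses the `timeout` command; this sketch swaps in the `timeout` argument of Python's `subprocess.run` instead. The `# TEST_TIMEOUT=NNN` marker is a hypothetical format chosen for illustration, since the message does not show what the "timeout line" looks like.

```python
import re
import subprocess
import sys

DEFAULT_TIMEOUT = 300  # seconds, the default named in the commit message

# Hypothetical marker format; the real test files may use a different one.
_MARKER = re.compile(r"^#\s*TEST_TIMEOUT=(\d+)\s*$", re.MULTILINE)


def timeout_for(test_source: str) -> int:
    """Return the per-test override if the test declares one, else the default."""
    match = _MARKER.search(test_source)
    return int(match.group(1)) if match else DEFAULT_TIMEOUT


def run_with_timeout(argv: list, seconds: float):
    """Run a command; return its exit status, or None if the limit was hit."""
    try:
        return subprocess.run(argv, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        return None


if __name__ == "__main__":
    print(timeout_for("#!/bin/bash\n# TEST_TIMEOUT=600\n"))
    print(run_with_timeout([sys.executable, "-c", "pass"], 30))
```

Making the override an explicit line in the test keeps the slow tests visible, which matches the message's point that test writers should think about a time limit.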
*	tests: fix spurious test failure	Atin Mukherjee	2018-02-13	1	-1/+1
	In bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t, check for the peer count after starting the glusterd instance on node 2. Change-Id: I3f92013719d94b6d92fb5db25efef1fb4b41d510 BUG: 1540607 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	gfapi: return pre/post attributes at callback for glfs api	Kinglong Mee	2018-02-12	2	-2/+4
	Updates: #389 Change-Id: Ic71632722effe4b8855d5de3e65688efd9afe1e3 Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
*	gfapi: return pre/post attributes from glfs_ftruncate	Kinglong Mee	2018-02-12	1	-1/+1
	Updates: #389 Change-Id: I8faea0828921fb17f05f7321c3cb01747373f21e Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
*	gfapi: return pre/post attributes from glfs_pread/pwrite	Kinglong Mee	2018-02-12	3	-3/+3
	For nfs-ganesha, WCC data containing pre/post attributes is returned in the read/write RPC reply. Right now, nfs-ganesha gets those attributes by issuing two getattr calls around the actual read/write. However, glusterfsd already returns pre/post attributes; they are just skipped in syncop/gfapi. If gfapi returns them, the upper-layer user (nfs-ganesha) can use them directly without any duplicate getattr. Updates: #389 Change-Id: I7b643ae4241cfe2aeb17063de00192d81674024a Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
*	glusterd: optimization of test cases	Sanju Rakonde	2018-02-10	93	-2696/+1575
	To reduce the overall time taken by every regression job for all glusterd test cases, avoid some duplicate tests by clubbing similar test cases into one. The real time taken for all regression jobs of glusterd without this patch is 1959 seconds; with this patch it is 1059 seconds. See the document below for reference. https://docs.google.com/document/d/1u8o4-wocrsuPDI8BwuBU6yi_x4xA_pf2qSrFY6WEQpo/edit?usp=sharing Change-Id: Ib14c61ace97e62c3abce47230dd40598640fe9cb BUG: 1530905 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	glusterd/snapshot : fix the compare snap logic	Atin Mukherjee	2018-02-10	1	-0/+13
	In commit cb0339f there is one particular case where, after removing the old snap, the new snap version was not written, and this caused one of the tests to fail spuriously. Change-Id: I3e83435fb62d6bba3bbe227e40decc6ce37ea77b BUG: 1540607 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	performance/io-threads: expose io-thread queue depths	Varsha Rao	2018-02-08	1	-0/+7
	The following release-3.8-fb branch patch is upstreamed: > io-stats: Expose io-thread queue depths > Commit ID: 69509ee7d2 > https://review.gluster.org/#/c/18143/ > By Shreyas Siravara <sshreyas@fb.com> Changes in this patch: - Replace iot_pri_t with gf_fop_pri_t - Replace IOT_PRI_{HI, LO, NORMAL, MAX, LEAST} with GF_FOP_PRI_{HI, LO, NORMAL, MAX, LEAST} - Use dict_unref() instead of dict_destroy() This patch is required to forward port io-threads namespace patch. Updates: #401 Change-Id: I1b47a63185a441a30fbc423ca1015df7b36c2518 Signed-off-by: Varsha Rao <varao@redhat.com>
*	tests/dht: Non-root can delete stale linkto files	N Balachandran	2018-02-08	1	-0/+51
	Test to check that non-root users can delete stale linkto files. Change-Id: Ic9bc76bc485cab839927af60cfce78a058eee2e4 BUG: 1542318 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	cluster/dht: avoid overwriting client writes during migration	Susant Palai	2018-02-02	2	-0/+51
	For more details on this issue see https://github.com/gluster/glusterfs/issues/308 Solution: This is a restrictive solution where a file will not be migrated if a client writes to it during the migration. It does not check whether the writes from the rebalance and from the client actually overlap. If dht_writev_cbk finds that the file is being migrated (PHASE1), it sets an xattr on the destination file indicating that the file was updated by a non-rebalance client. Rebalance checks whether any other client has written to the dst file and aborts the file migration if it finds the xattr. updates gluster/glusterfs#308 Change-Id: I73aec28bc9dbb8da57c7425ec88c6b6af0fbc9dd Signed-off-by: Susant Palai <spalai@redhat.com> Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	sdfs: crash fixes	Amar Tumballi	2018-02-01	1	-0/+22
	* The patch which got tested in the experimental branch went through a code cleanup which missed setting a local variable, which led to a crash immediately after enabling the feature. * Added a sanity test case to validate all the fops of sdfs. Updates: #397 Change-Id: I7e0bebfc195c344620577cb16c1afc5f4e7d2d92 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	afr: don't treat all cases all bricks being blamed as split-brain	Ravishankar N	2018-02-01	3	-0/+117
	Problem: We currently don't have a roll-back/undoing of post-ops if quorum is not met. Though the FOP is still unwound with failure, the xattrs remain on the disk. Due to these partial post-ops and partial heals (healing only when 2 bricks are up), we can end up in split-brain purely from the afr xattrs point of view, i.e. each brick is blamed by at least one of the others. These scenarios are hit when there are frequent connects/disconnects of the client/shd to the bricks while I/O or heal is in progress. Fix: Instead of undoing the post-op, pick a source based on the xattr values. If 2 bricks blame one, the blamed one must be treated as sink. If there is no majority, all are sources. Once we pick a source, self-heal will then do the heal instead of erroring out due to split-brain. Change-Id: I3d0224b883eb0945785ade0e9697a1c828aec0ae BUG: 1539358 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
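The source-picking rule stated in the fix can be written down as a small function over a blame matrix. This is a sketch for a 3-brick replica only; the function name and the matrix representation are assumptions, and the real code works on the pending changelog xattrs rather than booleans.

```python
def pick_sources(blames: list) -> list:
    """Return the indices of bricks to treat as sources.

    blames[i][j] is True when brick i's xattrs blame brick j.
    A brick blamed by two others is a sink; when no brick is blamed by a
    majority, nothing is excluded and all bricks are sources.
    """
    n = len(blames)
    sinks = set()
    for j in range(n):
        accusers = sum(1 for i in range(n) if i != j and blames[i][j])
        if accusers >= 2:
            sinks.add(j)
    return [j for j in range(n) if j not in sinks]
```

If bricks 0 and 1 both blame brick 2 while brick 2 blames brick 0, the function returns `[0, 1]` even though every brick is blamed by someone, which is the case that used to be reported as split-brain. When the blame is circular (0 blames 1, 1 blames 2, 2 blames 0) there is no majority and all three are returned. The case where every brick is blamed by both others is not covered by the message; this sketch returns an empty list there.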
*	tests: fix tests/bugs/fuse/bug-858215.t	Csaba Henk	2018-01-31	1	-2/+2
	Change-Id: Ifbf5e628ccb9a0ecb285f5884a41e70d935316bd Signed-off-by: Csaba Henk <csaba@redhat.com>
*	quiesce, gfproxy: Implement failover across multiple gfproxy nodes	Poornima G	2018-01-30	1	-0/+2
	Updates: #242 Change-Id: I767e574a26e922760a7130bd209c178d74e8cf69 Signed-off-by: Poornima G <pgurusid@redhat.com>
*	tests: Disable geo-rep tests	Nigel Babu	2018-01-24	2	-0/+2
	These tests are prone to issues at the moment that need further debugging and fixing. BUG: 1537602 Change-Id: Ic59ca620925c6f43948b8a751eaddb571b791969 Signed-off-by: Nigel Babu <nigelb@redhat.com>
*	libgfapi: Add new api for supporting mandatory-locks	Anoop C S	2018-01-22	3	-1/+543
	The current API for byte-range locks [glfs_posix_lock()] doesn't allow applications to specify whether the lock is of advisory or mandatory type. This change introduces an extended byte-range lock API with an additional argument that sets the byte-range lock mode to either advisory (default) or mandatory. The patch also includes a gfapi test case which makes use of this new API to acquire mandatory locks. Ref: https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.8/Mandatory%20Locks.md Change-Id: Ia09042c755d891895d96da857321abc4ce03e20c Updates #393 Signed-off-by: Anoop C S <anoopcs@redhat.com>
*	md-cache: Implement dynamic configuration of xattr list for caching	Poornima G	2018-01-22	1	-1/+1
	Currently, the list of xattrs that md-cache can cache is hard-coded in the md-cache.c file. This necessitates a code change and rebuild every time a new xattr needs to be added to the md-cache xattr cache list. With this patch, the user will be able to configure a comma-separated list of xattrs to be cached by md-cache. Updates #297 Change-Id: Ie35ed607d17182d53f6bb6e6c6563ac52bc3132e Signed-off-by: Poornima G <pgurusid@redhat.com>
*	afr: add quorum checks in post-op	Ravishankar N	2018-01-19	1	-1/+1
	afr relies on pending changelog xattrs to identify sources and sinks, and the setting of these xattrs happens in post-op. So if post-op fails, we need to unwind the write txn with a failure. Change-Id: I0f019ac03890108324ee7672883d774918b20be1 BUG: 1506140 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	upcall: Allow md-cache to specify invalidations on xattr with wildcard	Poornima G	2018-01-19	1	-0/+0
	Currently, md-cache sends a list of xattrs that it is interested in receiving invalidations for, but it cannot specify any wildcard in the xattr names. E.g.: user.* - invalidate on updating any xattr with the user. prefix. This patch enables upcall to honor wildcards in the xattr key names. Updates: #297 Change-Id: I98caf0ed72f11ef10770bf2067d4428880e0a03a Signed-off-by: Poornima G <pgurusid@redhat.com>
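Wildcard matching of xattr names, as in the `user.*` example above, can be sketched with shell-style patterns. The function name is hypothetical, and the use of Python's `fnmatch` module is an assumption made for illustration; the message does not say how upcall implements the match.

```python
from fnmatch import fnmatchcase


def needs_invalidation(changed_xattr: str, registered_patterns: list) -> bool:
    """True when the changed xattr matches any pattern the client registered."""
    return any(fnmatchcase(changed_xattr, pattern) for pattern in registered_patterns)


if __name__ == "__main__":
    patterns = ["user.*", "security.selinux"]
    print(needs_invalidation("user.comment", patterns))  # matches "user.*"
    print(needs_invalidation("trusted.foo", patterns))   # matches nothing
```

A plain name such as `security.selinux` still works as a pattern because a string without wildcard characters only matches itself.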
*	geo-rep: Validate availability of gluster binary on slave	Kotresh HR	2018-01-19	1	-0/+65
	1. Adds validation to check if the gluster binary is available on the slave. 2. Adds a simple geo-rep setup test case to verify whether the setup is fine. It's named in such a way that it runs first. BUG: 1532591 Change-Id: Ie777e55ae13db8fa97d4e32464ad82269ee5fd07 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	make sure geo-rep tests run first	Amar Tumballi	2018-01-17	2	-0/+0
	As we run regression with a 'sort' function, geo-rep becomes the last test to run. Instead, make sure it is the first test by changing the name of the directory, so that any setup failures are noticed much earlier. Change-Id: I9e8d81824274900be42c4c49c752a1602497fa31 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	locks: added inodelk/entrylk contention upcall notifications	Xavier Hernandez	2018-01-16	2	-4/+70
	The locks xlator is now able to send a contention notification to the current owner of the lock. This is only a notification that can be used to improve performance of some client side operations that might benefit from extended duration of lock ownership. Nothing is done if the lock owner decides to ignore the message and to not release the lock. For forced release of acquired resources, leases must be used. Change-Id: I7f1ad32a0b4b445505b09908a050080ad848f8e0 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
*	posix: delete stale gfid handles in nameless lookup	Ravishankar N	2018-01-16	1	-0/+65
	...in order for self-heal of symlinks to work properly (see BZ for details). Change-Id: I9a011d00b07a690446f7fd3589e96f840e8b7501 BUG: 1529488 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	tests: EC test fails with brick mux enabled	Sunil Kumar Acharya	2018-01-15	1	-1/+1
	Problem: With brick mux enabled, get_fd_count was returning a wrong value due to a parsing issue. Solution: Updated the code to fix the parsing problem. BUG: 1533594 Change-Id: I5d7ff6843b4760f866c4a5aab2f13ff7380f248e Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
*	cluster/ec Mark ./tests/basic/ec/heal-info.t as bad test	Ashish Pandey	2018-01-12	1	-0/+1
	Change-Id: I7369fdd7510cc7ebf051cc621fc83764ba9591f3 BUG: 1533815 Signed-off-by: Ashish Pandey <aspandey@redhat.com>
*	tests: check volume status for shd being up	Ravishankar N	2018-01-12	1	-0/+1
	...so that glusterd is also aware that shd is up and running. While not reproducible locally, on the Jenkins slaves 'gluster vol heal patchy' fails with "Self-heal daemon is not running. Check self-heal daemon log file.", while in fact the afr_child_up_status_in_shd() checks before that passed. In the shd log I also see the shd being up and connected to at least one brick before the heal is launched. Change-Id: Id3801fa4ab56a70b1f0bd6a7e240f69bea74a5fc BUG: 1515163 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	tests: Use /dev/urandom instead of /dev/random for dd	Pranith Kumar K	2018-01-08	1	-1/+1
	If there is not enough entropy in the system, reading /dev/random takes a significant time, because it takes a long time for the /dev/random buffers to fill as this dd run requires. Milind found that this test file takes almost 1000 seconds or more to pass instead of just a minute because of this. BUG: 1431955 Change-Id: I9145b17f77f09d0ab71816ae249c69b8fe14c1a5 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	Revert "rpc: merge ssl infra with epoll infra"	Milind Changire	2018-01-07	1	-1/+0
	This reverts commit 56e5fdae74845dfec0ff7ad0c8fee77695d36ad5. Change-Id: Ia62cee5440bbe8e23f5da9cff692d792091d544a Signed-off-by: Milind Changire <mchangir@redhat.com>
*	tests: Enable geo-rep test cases	Kotresh HR	2018-01-05	4	-265/+331
	This patch re-enables the geo-rep test cases. Along with that, it makes the following optimizations: 1. Use EXPECT_WITHIN instead of sleep. 2. Clean up the geo-rep ssh key after the test. 3. Change gverify.sh and S56glusterd-geo-rep-create-post.sh to use the given ssh identity file for geo-rep create. 4. Make gluster-command-dir configurable and introduce slave-gluster-command-dir; these point to the parent directory of the gluster binaries on master and slave respectively. Change-Id: Ia7696278d9dd3ba04224dcd7c3564088ca970b04 BUG: 1480491 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	cluster/ec: OpenFD heal implementation for EC	Sunil Kumar Acharya	2018-01-05	3	-11/+122
	Existing EC code does not try to heal open FDs, although doing so would avoid unnecessary healing of the data later. This fix implements healing of open FDs before carrying out file operations on them, by attempting to open the FDs on the required up nodes. BUG: 1431955 Change-Id: Ib696f59c41ffd8d5678a484b23a00bb02764ed15 Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
*	cluster/dht: Use percentages for space check	N Balachandran	2018-01-02	1	-0/+5
	With heterogeneous bricks now being supported in DHT, we could run into issues where files are not migrated even though there is sufficient space in newly added bricks which just happen to be considerably smaller than older bricks. Using percentages instead of absolute available space for space checks can mitigate that to some extent. Marking bug-1247563.t as that used to depend on the easier code to prevent a file from migrating. This will be removed once we find a way to force a file migration failure. Change-Id: I3452520511f304dbf5af86f0632f654a92fcb647 BUG: 1529440 Signed-off-by: N Balachandran <nbalacha@redhat.com>
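A small worked example shows why comparing percentages helps with bricks of different sizes. The two predicates below are hypothetical simplifications of a "does the destination have more room than the source" check; DHT's actual conditions are not given in the message.

```python
def free_pct(free: float, total: float) -> float:
    """Free space as a percentage of the brick's capacity."""
    return 100.0 * free / total


def dst_has_more_room_absolute(src_free: float, dst_free: float) -> bool:
    """Absolute comparison: only the free amounts are compared."""
    return dst_free > src_free


def dst_has_more_room_percent(src_free: float, src_total: float,
                              dst_free: float, dst_total: float) -> bool:
    """Percentage comparison: each brick is judged against its own capacity."""
    return free_pct(dst_free, dst_total) > free_pct(src_free, src_total)
```

Take an old 1000 GB brick with 300 GB free (30% free) and a new 100 GB brick with 90 GB free (90% free). The absolute check keeps the file on the old brick because 90 is less than 300, while the percentage check allows the migration because the new brick is far emptier relative to its size.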
*	cluster/ec: Change [f]getxattr to parallel-dispatch-one	Pranith Kumar K	2017-12-22	2	-0/+173
	At the moment in EC, [f]getxattr operations wait to acquire a lock while other operations are in progress, even when they come from the same mount that already holds a lock on the file/directory. This happens because [f]getxattr operations follow the model where the operation is wound on 'k' of the bricks and the answers are matched to make sure the data returned is the same on all of them. This consistency check requires that no other operations are ongoing while [f]getxattr operations are wound to the bricks. We can perform [f]getxattr in another way as well: find the good_mask from the lock that is already granted, wind the operation on any one of the good bricks, and unwind the answer to the parent xlator after adjusting size/blocks. Since we take good_mask into account, the reply we get will reflect the state either before or after a possible ongoing operation. Using this method, the operation doesn't need to depend on completion of ongoing operations, which could take a long time (for example with slow disks while writes are in progress). Thus we reduce the time to serve [f]getxattr requests. I changed [f]getxattr to dispatch-one and added extra logic in ec_link_has_lock_conflict() to not have any conflicts for fops with EC_MINIMUM_ONE as fop->minimum to achieve the effect described above. Modified scripts to make sure a READ fop is received in EC to trigger heals. Updates gluster/glusterfs#368 Change-Id: I3b4ebf89181c336b7b8d5471b0454f016cdaf296 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	dht: Fill first_up_subvol before use in dht_opendir	Poornima G	2017-12-15	1	-0/+23
	Reported by: Sam McLeod Change-Id: Ic8f9b46b173796afd70aff1042834b03ac3e80b2 BUG: 1512437 Signed-off-by: Poornima G <pgurusid@redhat.com>
*	quick-read: Integrate quick read with upcall and increase cache time	Poornima G	2017-12-13	1	-0/+69
	Fixes: #261 Co-author: Subha sree Mohankumar <smohanku@redhat.com> Change-Id: Ie9dd94e86459123663b9b200d92940625ef68eab Signed-off-by: Poornima G <pgurusid@redhat.com>
*	quick-read: Discard cache for fallocate, zerofill and discard ops	Sachin Prabhu	2017-12-13	2	-0/+228
	The fallocate, zerofill and discard fops modify file data on the server, thus rendering stale any cache held by the xlator on the client. BUG: 1524252 Change-Id: I432146c6390a0cd5869420c373f598da43915f3f Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
*	rpc: merge ssl infra with epoll infra	Milind Changire	2017-12-12	1	-0/+1
	This patch attempts to use the epoll infra for handling SSL connections as well, instead of the socket_poller() thread func. This essentially makes the priv->own_thread flag redundant. SSL_connect()/SSL_accept() is now non-blocking, which has done away with the localised poll() in ssl_do(). So, ssl_do() has been updated appropriately. own_thread, and with it the socket_poller() thread for SSL processing, is now deprecated. Added a timeout to test whether the self-heal daemon is up and running, as per Ravi's suggestion. Change-Id: If2b5d7b4fd19e321cb289e08d49a718d2161aafe Signed-off-by: Milind Changire <mchangir@redhat.com>